shreyaskarnik / DistiLlama

Chrome Extension to Summarize or Chat with Web Pages/Local Documents Using locally running LLMs. Keep all of your data and conversations private. 🔐
MIT License
273 stars, 25 forks

Bump @xenova/transformers from 2.16.1 to 2.17.1 #178

Closed dependabot[bot] closed 4 months ago

dependabot[bot] commented 4 months ago

Bumps @xenova/transformers from 2.16.1 to 2.17.1.

Release notes

Sourced from @xenova/transformers's releases.

2.17.1

What's new?

Full Changelog: https://github.com/xenova/transformers.js/compare/2.17.0...2.17.1

2.17.0

What's new?

💬 Improved text-generation pipeline for conversational models

This version adds support for passing an array of chat messages (with "role" and "content" properties) to the text-generation pipeline (PR). Check out the list of supported models here.

Example: Chat with Xenova/Qwen1.5-0.5B-Chat.

import { pipeline } from '@xenova/transformers';

// Create text-generation pipeline
const generator = await pipeline('text-generation', 'Xenova/Qwen1.5-0.5B-Chat');

// Define the list of messages
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Tell me a funny joke.' },
];

// Generate text
const output = await generator(messages, {
  max_new_tokens: 128,
  do_sample: false,
});
console.log(output[0].generated_text);
// [
//   { role: 'system', content: 'You are a helpful assistant.' },
//   { role: 'user', content: 'Tell me a funny joke.' },
//   { role: 'assistant', content: "Sure, here's one:\n\nWhy was the math book sad?\n\nBecause it had too many problems.\n\nI hope you found that joke amusing! Do you have any other questions or topics you'd like to discuss?" },
// ]
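Because the pipeline returns the whole conversation, including the new assistant turn, continuing a chat amounts to appending the next user message to that array. The following is a minimal sketch that assumes the output shape shown in the example above; the helper names `lastAssistantReply` and `withUserMessage` are hypothetical and not part of Transformers.js, and `history` is a hand-written stand-in for a real pipeline result.

```javascript
// Hypothetical helpers (not part of Transformers.js) for working with the
// message array found in output[0].generated_text.

// Return the content of the most recent assistant message, or null if none.
function lastAssistantReply(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === 'assistant') return messages[i].content;
  }
  return null;
}

// Build the input for the next turn without mutating the previous history.
function withUserMessage(messages, content) {
  return [...messages, { role: 'user', content }];
}

// Stand-in for output[0].generated_text from the example above.
const history = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Tell me a funny joke.' },
  { role: 'assistant', content: 'Why was the math book sad? Because it had too many problems.' },
];

const nextMessages = withUserMessage(history, 'Tell me another one.');
console.log(lastAssistantReply(history));
console.log(nextMessages.length); // 4
```

Passing `nextMessages` back to the generator would then be expected to produce the following assistant turn.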

We also added the return_full_text parameter: setting return_full_text=false returns only the newly generated tokens (only applicable when passing a raw text prompt to the pipeline).
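As a rough sketch of how this option might be used with a raw text prompt, the wrapper below takes the pipeline as an argument so it can be tried with a stub. `completeOnly` and `fakeGenerator` are hypothetical names. The sketch assumes that, for a string prompt, the pipeline resolves to an array whose first element has a string `generated_text`, and that the full text is returned by default.

```javascript
// Hypothetical wrapper: ask for the continuation only, not prompt + continuation.
async function completeOnly(generator, prompt) {
  const output = await generator(prompt, {
    max_new_tokens: 64,
    return_full_text: false, // request only the newly generated text
  });
  return output[0].generated_text;
}

// Stub standing in for a real text-generation pipeline (illustration only).
// It mimics the assumed behavior: prompt + continuation unless
// return_full_text is explicitly false.
async function fakeGenerator(prompt, options = {}) {
  const continuation = ' there was a small robot.';
  const returnFull = options.return_full_text ?? true;
  return [{ generated_text: returnFull ? prompt + continuation : continuation }];
}

completeOnly(fakeGenerator, 'Once upon a time,').then((text) => {
  console.log(JSON.stringify(text));
});
```

With a real pipeline, `fakeGenerator` would be replaced by the value returned from `pipeline('text-generation', ...)`.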

🔢 Binary embedding quantization support

Transformers.js v2.17 adds two new parameters to the feature-extraction pipeline ("quantize" and "precision"), enabling you to generate binary embeddings. These can be used with certain embedding models to shrink the size of the document embeddings for retrieval, which reduces index size and memory usage (for storage) and improves retrieval speed. Surprisingly, you can still achieve up to ~95% of the original performance, with 32x storage savings and up to 32x faster retrieval! 🤯 Thanks to @jonathanpv for this addition in xenova/transformers.js#691!

import { pipeline } from '@xenova/transformers';

... (truncated)
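The release notes' own example is cut off here. Independently of the pipeline, the idea behind binary embeddings can be sketched in plain JavaScript: keep only the sign of each dimension, pack eight dimensions per byte, and compare vectors by Hamming distance. This illustrates the general technique under those assumptions and is not the library's implementation; `binarize` and `hammingDistance` are hypothetical helper names, and the input vectors are made up.

```javascript
// Binarize a float embedding: 1 where the value is > 0, else 0,
// packed most-significant-bit first into a Uint8Array.
function binarize(embedding) {
  const packed = new Uint8Array(Math.ceil(embedding.length / 8));
  for (let i = 0; i < embedding.length; i++) {
    if (embedding[i] > 0) packed[i >> 3] |= 1 << (7 - (i & 7));
  }
  return packed;
}

// Number of differing bits between two packed vectors of equal length.
function hammingDistance(a, b) {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      distance += x & 1;
      x >>= 1;
    }
  }
  return distance;
}

// Made-up 8-dimensional embeddings for illustration.
const a = new Float32Array([0.3, -1.2, 0.8, 0.1, -0.4, -0.9, 0.5, 0.2]);
const b = new Float32Array([0.1, 0.7, 0.9, -0.3, -0.2, -0.1, 0.6, -0.5]);

const qa = binarize(a); // bits 10110011
const qb = binarize(b); // bits 11100010
console.log(hammingDistance(qa, qb)); // 3
console.log(a.byteLength / qa.byteLength); // 32
```

Each float32 dimension (32 bits) shrinks to a single bit, which is where the 32x storage figure comes from.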




Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)