LibrePhotos / librephotos

A self-hosted, open-source photo management service. This repository contains the backend.
MIT License

chore(deps): update dependency sentence_transformers to v3 #1291

Open renovate[bot] opened 3 months ago

renovate[bot] commented 3 months ago

This PR contains the following updates:

| Package | Change |
| --- | --- |
| sentence_transformers | `==2.7.0` -> `==3.1.1` |

Release Notes

UKPLab/sentence-transformers (sentence_transformers)

### [`v3.1.1`](https://redirect.github.com/UKPLab/sentence-transformers/releases/tag/v3.1.1): Patch hard negative mining & remove `numpy<2` restriction

[Compare Source](https://redirect.github.com/UKPLab/sentence-transformers/compare/v3.1.0...v3.1.1)

This patch release fixes hard negatives mining for models that don't automatically normalize their embeddings, and it lifts the `numpy<2` restriction that was previously required. Install this version with

```bash
# Full installation:
pip install sentence-transformers[train]==3.1.1

# Inference only:
pip install sentence-transformers==3.1.1
```

#### Hard Negatives Mining Patch ([#2944](https://redirect.github.com/UKPLab/sentence-transformers/issues/2944))

The [`mine_hard_negatives`](https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.mine_hard_negatives) utility introduced in the previous release would fail if `use_faiss=True` and the model does not automatically normalize its embeddings. This release patches that, allowing the utility to work with [all Sentence Transformer models](https://huggingface.co/models?library=sentence-transformers):

```python
from sentence_transformers.util import mine_hard_negatives
from sentence_transformers import SentenceTransformer
from datasets import load_dataset

# Load a Sentence Transformer model
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1").bfloat16()

# Load a dataset to mine hard negatives from
dataset = load_dataset("sentence-transformers/natural-questions", split="train[:10000]")
print(dataset)
"""
Dataset({
    features: ['query', 'answer'],
    num_rows: 10000
})
"""

# Mine hard negatives
dataset = mine_hard_negatives(
    dataset=dataset,
    model=model,
    range_min=10,
    range_max=50,
    max_score=0.8,
    margin=0.1,
    num_negatives=5,
    sampling_strategy="random",
    batch_size=128,
    use_faiss=True,
)
'''
Batches: 100%|██████████| 75/75 [00:21<00:00,  3.51it/s]
Batches: 100%|██████████| 79/79 [00:03<00:00, 25.77it/s]
Querying FAISS index: 100%|██████████| 1/1 [00:00<00:00,  3.98it/s]

       Metric       Positive       Negative     Difference
        Count         10,000         47,711
         Mean         0.7600         0.5376         0.2299
       Median         0.7673         0.5379         0.2274
          Std         0.0658         0.0387         0.0629
          Min         0.3858         0.3732         0.1044
          25%         0.7219         0.5129         0.1833
          50%         0.7673         0.5379         0.2274
          75%         0.8058         0.5617         0.2724
          Max         0.9341         0.7024         0.4780

Skipped 48770 potential negatives (9.56%) due to the margin of 0.1.
Could not find enough negatives for 2289 samples (4.58%). Consider adjusting the range_max, range_min, margin and max_score parameters if you'd like to find more valid negatives.
'''
print(dataset)
'''
Dataset({
    features: ['query', 'answer', 'negative'],
    num_rows: 47711
})
'''
print(dataset[0])
'''
{
    'query': 'where is the us navy base in japan located',
    'answer': 'United States Fleet Activities Yokosuka The United States Fleet Activities Yokosuka (横須賀海軍施設, Yokosuka kaigunshisetsu) or Commander Fleet Activities Yokosuka (司令官艦隊活動横須賀, Shirei-kan kantai katsudō Yokosuka) is a United States Navy base in Yokosuka, Japan. Its mission is to maintain and operate base facilities for the logistic, recreational, administrative support and service of the U.S. Naval Forces Japan, Seventh Fleet and other operating forces assigned in the Western Pacific. CFAY is the largest strategically important U.S. naval installation in the western Pacific.[1] As of August 2013[update], it was commanded by Captain David Glenister.',
    'negative': "2011 Tōhoku earthquake and tsunami The earthquake took place at 14:46 JST (UTC 05:46) around 67\xa0km (42\xa0mi) from the nearest point on Japan's coastline, and initial estimates indicated the tsunami would have taken 10 to 30\xa0minutes to reach the areas first affected, and then areas farther north and south based on the geography of the coastline.[127][128] Just over an hour after the earthquake at 15:55 JST, a tsunami was observed flooding Sendai Airport, which is located near the coast of Miyagi Prefecture,[129][130] with waves sweeping away cars and planes and flooding various buildings as they traveled inland.[131][132] The impact of the tsunami in and around Sendai Airport was filmed by an NHK News helicopter, showing a number of vehicles on local roads trying to escape the approaching wave and being engulfed by it.[133] A 4-metre-high (13\xa0ft) tsunami hit Iwate Prefecture.[134] Wakabayashi Ward in Sendai was also particularly hard hit.[135] At least 101 designated tsunami evacuation sites were hit by the wave.[136]"
}
'''
dataset.push_to_hub("natural-questions-hard-negatives", "triplet")
```

Thanks to @omarnj-lab for pointing out the bug to me.

#### Numpy restriction lifted ([#2937](https://redirect.github.com/UKPLab/sentence-transformers/issues/2937))

The [v3.1.0 Sentence Transformers release](https://redirect.github.com/UKPLab/sentence-transformers/releases/tag/v3.1.0) required `numpy<2` to prevent crashes on Windows. However, various third parties (e.g. scipy) have since been recompiled and re-released, allowing the Windows tests to pass again.

If you encounter the following error:

> A module that was compiled using NumPy 1.x cannot be run in NumPy 2.0.0 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be compiled with NumPy 2.0. Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
> If you are a user of the module, the easiest solution will be to downgrade to 'numpy<2' or try to upgrade the affected module. We expect that some modules will need time to support NumPy 2.

then consider 1) upgrading the dependency from which the error occurred, or 2) downgrading `numpy` to below v2:

```bash
pip install -U "numpy<2"
```

Thanks to @kozlek for pointing this out to me and helping get it resolved.
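You can check for yourself whether a model normalizes its output, which is the property this patch concerns (a minimal sketch, reusing the model from the example above; any Sentence Transformer model works):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
embeddings = model.encode(["A quick normalization check"])

# Norms near 1.0 mean the model normalizes its embeddings itself; any other
# value means the mining utility has to normalize before the FAISS search,
# which is the step that v3.1.1 fixes for use_faiss=True.
print(np.linalg.norm(embeddings, axis=1))
```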
#### All changes

- [`deps`] Attempt to remove numpy restrictions by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2937
- [`metadata`] Extend pyproject.toml metadata by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2943
- [`fix`] Ensure that the embeddings from hard negative mining are normalized by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2944

**Full Changelog**: https://github.com/UKPLab/sentence-transformers/compare/v3.1.0...v3.1.1

### [`v3.1.0`](https://redirect.github.com/UKPLab/sentence-transformers/releases/tag/v3.1.0): Hard Negatives Mining utility; new loss function for symmetric tasks; streaming datasets; custom modules

[Compare Source](https://redirect.github.com/UKPLab/sentence-transformers/compare/v3.0.1...v3.1.0)

This release introduces a [hard negatives mining utility](https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.mine_hard_negatives) to get better models out of your data, a strong new [loss function](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) for symmetric tasks, training with streaming datasets to avoid having to store datasets fully on disk, custom modules to allow for more creativity from model authors, and many bug fixes, small additions, and documentation improvements.

Install this version with

```bash
# Full installation:
pip install sentence-transformers[train]==3.1.0

# Inference only:
pip install sentence-transformers==3.1.0
```

> [!WARNING]
> Due to incompatibilities with Windows, we have set `numpy<2` in the Sentence Transformers requirements. If you're not on Windows, you can still install `numpy>=2` and everything should work as expected.

#### Hard Negatives Mining utility ([#2768](https://redirect.github.com/UKPLab/sentence-transformers/issues/2768), [#2848](https://redirect.github.com/UKPLab/sentence-transformers/issues/2848))

Hard negatives are texts that are rather similar to some anchor text (e.g. a question), but are not the correct match. For example:

- Anchor: "are red pandas actually pandas?"
- Positive: "Red pandas, like giant pandas, are bamboo eaters native to Asia's high forests. Despite these similarities and their shared name, the two species are not closely related. Red pandas are much smaller than giant pandas and are the only living member of their taxonomic family."
- Hard negative: "The giant panda (Ailuropoda melanoleuca; Chinese: 大熊猫; pinyin: dàxióngmāo), also known as the panda bear or simply the panda, is a bear native to south central China."

These negatives are more difficult for a model to distinguish from the correct answer, leading to a stronger training signal and a stronger overall model when used with one of the [Loss Functions](https://sbert.net/docs/sentence_transformer/loss_overview.html) that accepts (anchor, positive, negative) triplets such as the one above.
This release introduces a utility function called [`mine_hard_negatives`](https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.mine_hard_negatives) that allows you to mine for these hard negatives given an (anchor, positive) dataset (and optionally a corpus of negative candidate texts). It boasts the following features to give you fine-grained control over the similarity of the mined negatives relative to the anchor:

- [CrossEncoder](https://sbert.net/docs/quickstart.html#cross-encoder) rescoring for higher-quality negative selection.
- Skip the top $n$ negative candidates, as these might be true positives.
- Consider only the top $n$ negative candidates.
- Skip negative candidates that are within some `margin` of the true similarity between anchor and positive.
- Skip negative candidates whose similarity is larger than some `max_score`.
- Two sampling strategies: pick the top negative candidates that satisfy the requirements, or pick them randomly.
- FAISS index for searching for negative candidates.
- Option to return data as triplets only, or as `2 + num_negatives`-tuples.

```python
from sentence_transformers.util import mine_hard_negatives
from sentence_transformers import SentenceTransformer
from datasets import load_dataset

# Load a Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")

# Load a dataset to mine hard negatives from
dataset = load_dataset("sentence-transformers/natural-questions", split="train")
print(dataset)
"""
Dataset({
    features: ['query', 'answer'],
    num_rows: 100231
})
"""

# Mine hard negatives
dataset = mine_hard_negatives(
    dataset=dataset,
    model=model,
    range_min=10,
    range_max=50,
    max_score=0.8,
    margin=0.1,
    num_negatives=5,
    sampling_strategy="random",
    batch_size=128,
    use_faiss=True,
)
'''
Batches: 100%|██████████| 588/588 [00:33<00:00,  17.37it/s]
Batches: 100%|██████████| 784/784 [00:07<00:00, 101.55it/s]
Querying FAISS index: 100%|██████████| 7/7 [00:07<00:00,  1.06s/it]

       Metric       Positive       Negative     Difference
        Count        100,231        460,725        460,725
         Mean         0.6866         0.4133         0.2917
       Median         0.7010         0.4059         0.2873
          Std         0.1125         0.0673         0.1006
          Min         0.0303         0.1638         0.1029
          25%         0.6221         0.3649         0.2112
          50%         0.7010         0.4059         0.2873
          75%         0.7667         0.4561         0.3647
          Max         0.9584         0.7362         0.7073

Skipped 882722 potential negatives (17.27%) due to the margin of 0.1.
Skipped 27 potential negatives (0.00%) due to the maximum score of 0.8.
Could not find enough negatives for 40430 samples (8.07%). Consider adjusting the range_max, range_min, margin and max_score parameters if you'd like to find more valid negatives.
'''
print(dataset)
'''
Dataset({
    features: ['query', 'answer', 'negative'],
    num_rows: 460725
})
'''
print(dataset[0])
'''
{
    'query': 'the first person to use the word geography was',
    'answer': 'History of geography The history of geography includes many histories of geography which have differed over time and between different cultural and political groups. In more recent developments, geography has become a distinct academic discipline. \'Geography\' derives from the Greek γεωγραφία – geographia,[1] a literal translation of which would be "to describe or write about the Earth". The first person to use the word "geography" was Eratosthenes (276–194 BC). However, there is evidence for recognizable practices of geography, such as cartography (or map-making) prior to the use of the term geography.',
    'negative': 'Terminology of the British Isles The word "Great" means "larger", in comparison with Brittany in modern-day France. One historical term for the peninsula in France that largely corresponds to the modern French province is Lesser or Little Britain. That region was settled by many British immigrants during the period of Anglo-Saxon migration into Britain, and named "Little Britain" by them. The French term "Bretagne" now refers to the French "Little Britain", not to the British "Great Britain", which in French is called Grande-Bretagne. In classical times, the Graeco-Roman geographer Ptolemy in his Almagest also called the larger island megale Brettania (great Britain). At that time, it was in contrast to the smaller island of Ireland, which he called mikra Brettania (little Britain).[62] In his later work Geography, Ptolemy refers to Great Britain as Albion and to Ireland as Iwernia. These "new" names were likely to have been the native names for the islands at the time. The earlier names, in contrast, were likely to have been coined before direct contact with local peoples was made.[63]'
}
'''
dataset.push_to_hub("natural-questions-hard-negatives", "triplet")
```

This dataset can immediately be used in conjunction with [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss), likely resulting in a stronger model than if you had just used the [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) dataset outright (a short training sketch follows below). Here are some example datasets that I created using this new function:

- https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives
- https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives

Big thanks to @ChrisGeishauser and @ArthurCamara for assisting with this feature.
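For instance, a minimal training sketch over the mined triplets, assuming the `"triplet"` configuration pushed in the example above under the author's `tomaarsen` namespace:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("all-MiniLM-L6-v2")

# The (query, answer, negative) triplets mined and pushed to the Hub above.
train_dataset = load_dataset("tomaarsen/natural-questions-hard-negatives", "triplet", split="train")

# In-batch negatives loss; the mined "negative" column supplies extra hard negatives.
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```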
#### Add CachedMultipleNegativesSymmetricRankingLoss loss function ([#2879](https://redirect.github.com/UKPLab/sentence-transformers/issues/2879))

Let's break this down:

- [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) (MNRL): Given (anchor, positive) text pairs or (anchor, positive, negative) text triplets, this loss trains for "Given an anchor (e.g. a query), which text out of a big lineup (all positives and negatives in the batch) is the true positive (e.g. the answer)?".
- [MultipleNegativesSymmetricRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) (MNSRL): Adaptation of MNRL that adds a second loss term which means: "Given a positive (e.g. a summary), which text out of a big lineup (all anchors) is the true anchor (e.g. the full article)?". This is useful for symmetric tasks, such as clustering, classification, and finding similar texts, and a bit less useful for asymmetric tasks such as question-answer retrieval.
- [CachedMultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) (CMNRL): Adaptation of MNRL such that the batch size can be increased to an arbitrary size at a flat 10-20% training speed cost. A higher batch size means a larger lineup for the model to find the true positive in, often resulting in a better training signal and model.

The v3.1 Sentence Transformers release now introduces a new loss: [CachedMultipleNegativesSymmetricRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) (CMNSRL), which combines both of the previous adaptations. The result is a loss adept at symmetric training tasks for which you can pick an arbitrarily large batch size. It is likely the strongest loss for Semantic Textual Similarity (STS) tasks in Sentence Transformers now. Big thanks to @madhavthaker1 for working to include it.
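A minimal usage sketch, assuming CMNSRL mirrors CMNRL's constructor (a model plus an optional `mini_batch_size`); the model name and output path are arbitrary examples:

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainingArguments
from sentence_transformers.losses import CachedMultipleNegativesSymmetricRankingLoss

model = SentenceTransformer("microsoft/mpnet-base")

# mini_batch_size bounds peak GPU memory; the effective batch size is set by
# per_device_train_batch_size below and can be made arbitrarily large.
loss = CachedMultipleNegativesSymmetricRankingLoss(model, mini_batch_size=32)

args = SentenceTransformerTrainingArguments(
    output_dir="models/mpnet-base-sts",   # arbitrary example path
    per_device_train_batch_size=2048,     # large lineup, computed in cached chunks
)
```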
#### Streaming Dataset support ([#2792](https://redirect.github.com/UKPLab/sentence-transformers/issues/2792))

The v3.1 release introduces support for training with [`datasets.IterableDataset`](https://huggingface.co/docs/datasets/v2.21.0/en/package_reference/main_classes#datasets.IterableDataset) ([*Differences between Dataset and IterableDataset* docs](https://huggingface.co/docs/datasets/en/about_mapstyle_vs_iterable)). This means that you can train without first downloading the full dataset to disk. For example:

```python
from datasets import load_dataset

# Load a streaming dataset to finetune on
train_dataset = load_dataset("sentence-transformers/gooaq", split="train", streaming=True)
# IterableDataset({
#     features: ['question', 'answer'],
#     n_shards: 2
# })
```

or

```python
from datasets import IterableDataset, Value, Features

def dataset_generator_fn():
    # Gather, fetch, load, or generate data here
    for ... in ...:
        yield ...

train_dataset = IterableDataset.from_generator(dataset_generator_fn)
train_dataset = train_dataset.cast(Features({'question': Value(dtype='string', id=None), 'answer': Value(dtype='string', id=None)}))
```

(*Read more about Dataset features [here](https://huggingface.co/docs/datasets/en/about_dataset_features).*)

For a full example of training with a streaming dataset, consider this script:

```python
import logging

from datasets import load_dataset

from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    SentenceTransformerModelCardData,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

logging.basicConfig(
    format="%(asctime)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S", level=logging.INFO
)

# 1. Load a model to finetune with 2. (Optional) model card data
model = SentenceTransformer(
    "microsoft/mpnet-base",
    model_card_data=SentenceTransformerModelCardData(
        language="en",
        license="apache-2.0",
        model_name="MPNet base trained on GooAQ pairs",
    ),
)

name = "mpnet-base-gooaq-streaming"

# 2. Load a streaming dataset to finetune on
train_dataset = load_dataset("sentence-transformers/gooaq", split="train", streaming=True)

# 3. Define a loss function
loss = MultipleNegativesRankingLoss(model)

# 4. (Optional) Specify training arguments
train_batch_size = 64
args = SentenceTransformerTrainingArguments(
    # Required parameter:
    output_dir=f"models/{name}",
    # Optional training parameters:
    num_train_epochs=1,
    per_device_train_batch_size=train_batch_size,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=False,  # Set to False if you get an error that your GPU can't run on FP16
    bf16=True,  # Set to True if you have a GPU that supports BF16
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # MultipleNegativesRankingLoss benefits from no duplicate samples in a batch
    # Optional tracking/debugging parameters:
    save_strategy="steps",
    save_steps=100,
    save_total_limit=2,
    logging_steps=250,
    logging_first_step=True,
    run_name=name,  # Will be used in W&B if `wandb` is installed
)

# 5. Create a trainer & train
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()

# 6. Save the trained model
model.save_pretrained(f"models/{name}/final")

# 7. (Optional) Push it to the Hugging Face Hub
model.push_to_hub(name)
```

#### Advanced: Allow for Custom Modules ([#2773](https://redirect.github.com/UKPLab/sentence-transformers/issues/2773))

Sentence Transformer models consist of several modules that are executed sequentially. Most models consist of a [Transformer](https://sbert.net/docs/package_reference/sentence_transformer/models.html#sentence_transformers.models.Transformer) module, a [Pooling](https://sbert.net/docs/package_reference/sentence_transformer/models.html#sentence_transformers.models.Pooling) module, and perhaps a [Dense](https://sbert.net/docs/package_reference/sentence_transformer/models.html#sentence_transformers.models.Dense) and/or [Normalize](https://sbert.net/docs/package_reference/sentence_transformer/models.html#sentence_transformers.models.Normalize) module. As of the v3.1 release, model authors can create their own modules by writing some custom modeling code. This code can be uploaded to the Hugging Face Hub alongside the model itself, after which users can load the model like normal.

This allows authors to replace the `Transformer` module with one that includes model-specific quirks, or to replace the `Pooling` module with an all-new pooling method. This even allows for multi-modal models, as authors can customize the preprocessing of the first module. [jinaai/jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1) is the first model to take advantage of this new feature, allowing you to encode both texts and images (via paths to local images or URLs) thanks to its custom preprocessing.
Try it out yourself:

```python
from sentence_transformers import SentenceTransformer

# Load the model; must use trust_remote_code=True to run the custom module
model = SentenceTransformer("jinaai/jina-clip-v1", trust_remote_code=True)

# Texts and images of blue and red cats to embed
sentences = ['A blue cat', 'A red cat']
image_urls = [
    'https://i.pinimg.com/600x315/21/48/7e/21487e8e0970dd366dafaed6ab25d8d8.jpg',
    'https://i.pinimg.com/736x/c9/f2/3e/c9f23e212529f13f19bad5602d84b78b.jpg'
]

# Embed the texts and images like normal
text_embeddings = model.encode(sentences)
image_embeddings = model.encode(image_urls)

# Compute similarity between text embeddings:
print(model.similarity(text_embeddings[0], text_embeddings[1]))
# tensor([[✅0.5636]])

# or cross-modal text and image embeddings:
print(model.similarity(text_embeddings, image_embeddings))
# tensor([[✅0.2906, ❌0.0569],
#         [❌0.1277, ✅0.2916]])
```

Additionally, model authors can take advantage of keyword argument passthrough. By updating the `modules.json` file to include a list of `kwargs`, e.g.:

```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "custom_transformer.CustomTransformer",
    "kwargs": ["task_type"]
  },
  ...
]
```

then if a user provides the `task_type` keyword argument in `model.encode`, this value will be propagated to the `forward` of the custom module(s). This way, users can specify some custom functionality on the fly at inference time (as well as at load time via the `model_kwargs` option when initializing a `SentenceTransformer` model).
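With a `modules.json` like the above, the keyword can then be passed straight through `encode` (a sketch; the model id and the `task_type` value are hypothetical, matching the snippet's `kwargs` entry):

```python
from sentence_transformers import SentenceTransformer

# Hypothetical model that ships the custom_transformer.CustomTransformer module.
model = SentenceTransformer("example-org/custom-module-model", trust_remote_code=True)

# "task_type" is listed under "kwargs" in modules.json, so encode() forwards it
# to the custom module's forward() during inference.
embeddings = model.encode(["a query about red pandas"], task_type="retrieval")
```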
#### Update dependency versions ([#2757](https://redirect.github.com/UKPLab/sentence-transformers/issues/2757))

- Restrict `numpy<2.0.0` due to issues with `torch` and `numpy` interoperability on Windows.
- Increment the minimum `transformers` version to 4.38.0 and `huggingface-hub` to 0.19.3 to prevent a training crash related to the `prefetch_factor` option.

#### Smaller Highlights

##### Features

- Add `show_progress_bar` to [`encode_multi_process`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode_multi_process) (#2762)
- Add `revision` to [`push_to_hub`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.push_to_hub) (#2902)
- Add `cache_dir` and `config_args` to CrossEncoder (#2784)
- Warn users if they might be passing training/evaluation columns in the wrong order, leading to worse training performance (#2928)

##### Bug fixes

- Prevent crash when encoding an empty list (#2759)
- Support training with `GISTEmbedLoss` with DataParallel (DP) and DataDistributedParallel (DDP) (#2772)
- Fix a bug in `GroupByLabelBatchSampler` resulting in some data not being used in training (#2788)
- Prevent crash if a `datasets` directory exists locally (#2859)
- Fix `Matryoshka2dLoss` not importing correctly (#2907)
- Resolve niche training bug when using multi-dataset training, no-duplicates batch sampling, and `dataloader_drop_last=True` together (#2877)
- Fix `torch_compile=True` not working in the `SentenceTransformersTrainingArguments`: should now work for faster training (#2884)
- Fix `SoftmaxLoss` performing worse since v3.0, as a Linear layer was ignored by the optimizer (#2881)
- Fix `trainer.train(resume_from_checkpoint="...")` with custom models (i.e. `trust_remote_code`) (#2918)
- Fix the evaluation using the training batch size (#2847)
- Fix encoding when passing `model_kwargs={"torch_dtype": torch.float16}` with models that use Dense layers (#2889)

##### Documentation

- New [documentation for batch samplers](https://sbert.net/docs/package_reference/sentence_transformer/sampler.html) (#2921, various PRs by @fpgmaas)
- New [documentation for custom modules and model structure](https://sbert.net/docs/sentence_transformer/usage/custom_models.html) (#2773)

#### All changes

- [Typing] make device optional by @michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2731
- [Spelling] Docs by @michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2733
- [Spelling] Codespell readme by @michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2736
- [Spelling] update examples by @michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2734
- [`versions`] Increment transformers/hf-hub versions to prevent training crash by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2757
- Typo fixed in examples/training/sts/training_stsbenchmark.py by @akkefa in https://github.com/UKPLab/sentence-transformers/pull/2743
- spelling: code comment updates by @michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2735
- Update DenoisingAutoEncoderDataset.py by @sophia8844 in https://github.com/UKPLab/sentence-transformers/pull/2747
- [`fix`] Prevent crash when encoding empty list by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2759
- Fix syntax warning (issue #2687) by @wyattscarpenter in https://github.com/UKPLab/sentence-transformers/pull/2765
- [`feat`] Add show_progress_bar to encode_multi_process by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2762
- Typing overload by @janrito in https://github.com/UKPLab/sentence-transformers/pull/2763
- [`fix`] Fix retokenization on DDP/DP with GIST losses by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2775
- Cast predict scores to float before converting to numpy by @malteos in https://github.com/UKPLab/sentence-transformers/pull/2783
- Elasticsearch example: simplify setup by @maxjakob in https://github.com/UKPLab/sentence-transformers/pull/2778
- [chore] Enable ruff rules `Warning (W)` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2789
- [fix] Add tests for 3.12 in cicd by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2785
- Allow inheriting the Transformer class by @mokha in https://github.com/UKPLab/sentence-transformers/pull/2810
- [`feat`] Add hard negatives mining utility by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2768
- [chore] add test for NoDuplicatesBatchSampler by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2795
- [chore] Add test for RoundrobinBatchSampler by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2798
- [feat] Improve GroupByLabelBatchSampler by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2788
- [`chore`] Clean-up `.gitignore` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2799
- [chore] improve the use of ruff and pre-commit hooks by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2793
- [feat] Move from `setup.py` and `setup.cfg` to `pyproject.toml` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2786
- [chore] Add `pytest-cov` and add test coverage command to the Makefile by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2794
- Move `pytest` config to `pyproject.toml` and remove `pytest.ini` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2819
- [`fix`] Fix packages discovery in `pyproject.toml` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2825
- Fix `ruff` pre-commit hook by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2826
- [`chore`] Enable `isort` with `ruff` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2828
- [`chore`] Enable ruff rules `UP006` and `UP007` to improve type hints by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2830
- [`chore`] Enable ruff's pyupgrade (`UP`) ruleset by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2834
- update SoftmaxLoss arguments by @KiLJ4EdeN in https://github.com/UKPLab/sentence-transformers/pull/2894
- [feat] Added revision to push_to_hub argument by @pesuchin in https://github.com/UKPLab/sentence-transformers/pull/2902
- Perform additional check for owner string in `is__available` functions by @leblancfg in https://github.com/UKPLab/sentence-transformers/pull/2859
- [`style`] Replace Huggingface with Hugging Face by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2905
- Fix typo: "comuptation" -> "computation" by @jeffwidman in https://github.com/UKPLab/sentence-transformers/pull/2909
- [`ci`] Attempt to fix CI disk space issues by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2906
- [`docs`] Fix typo and broken links in documentation by @ZiyiXia in https://github.com/UKPLab/sentence-transformers/pull/2861
- Add MNSRL with GradCache by @madhavthaker1 in https://github.com/UKPLab/sentence-transformers/pull/2879
- Fix 'module object is not callable' error in Matryoshka2dLoss by @pesuchin in https://github.com/UKPLab/sentence-transformers/pull/2907
- [`chore`] Add unittests for `InformationRetrievalEvaluator` by @fpgmaas in https://github.com/UKPLab/sentence-transformers/pull/2838
- [`fix`] Safely continue if ProportionalBatchSampler sub-batch sampler throws StopIteration by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2877
- [`fix`] Fix `torch_compile=True` by always inserting a wrapped model into the loss by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2884
- [`fix`] Fix SoftmaxLoss by initializing the optimizer over the loss(es) rather than the model by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2881
- [`fix`] Fix trainer.train(resume_from_checkpoint="...") with custom models (i.e. `trust_remote_code`) by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2918
- [`docs`] Heavily extend sampler documentation by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2921
- [`feat`] Add support for streaming datasets by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2792
- [`fix`] Change eval dataloader to use eval_batch_size by @akashd-2 in https://github.com/UKPLab/sentence-transformers/pull/2847
- [`feat`] Add cache_dir support to CrossEncoder by @RoyBA in https://github.com/UKPLab/sentence-transformers/pull/2784
- [`deprecation`] Push deprecation cycle for `use_auth_token` to v4 by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2926
- [`security`] Load weights only with torch.load & pytorch_model.bin by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2927
- [`feat`] Allow loading custom modules; encode kwargs passthrough to modules by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2773
- [`fix`] Add dtype cast for modules other than Transformer by @ir2718 in https://github.com/UKPLab/sentence-transformers/pull/2889
- [`docs`] Move losses up in the package reference; they're more important by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2929
- [`feat`] Add column order warnings to the data collator by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2928

#### New Contributors

- @akkefa made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2743
- @sophia8844 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2747
- @wyattscarpenter made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2765
- @janrito made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2763
- @malteos made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2783
- @fpgmaas made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2789
- @KiLJ4EdeN made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2894
- @pesuchin made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2902
- @leblancfg made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2859
- @jeffwidman made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2909
- @ZiyiXia made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2861
- @madhavthaker1 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2879
- @akashd-2 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2847
- @RoyBA made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2784

Big thanks to @fpgmaas for the large number of valuable contributions surrounding tests, CI, config files, and overall project health.

**Full Changelog**: https://github.com/UKPLab/sentence-transformers/compare/v3.0.1...v3.1.0

### [`v3.0.1`](https://redirect.github.com/UKPLab/sentence-transformers/releases/tag/v3.0.1): Patch introducing new Trainer features, model card improvements and evaluator fixes

[Compare Source](https://redirect.github.com/UKPLab/sentence-transformers/compare/v3.0.0...v3.0.1)

This patch release introduces some improvements for the SentenceTransformerTrainer, as well as some updates for the automatic model card generation. It also patches some minor evaluator bugs and a bug with `MatryoshkaLoss`. Lastly, every single Sentence Transformer model can now be saved and loaded with the safer `model.safetensors` files.
Install this version with

```bash
# Full installation:
pip install sentence-transformers[train]==3.0.1

# Inference only:
pip install sentence-transformers==3.0.1
```

#### SentenceTransformerTrainer improvements

- Implement gradient checkpointing for lower memory usage during training (#2717)
- Implement support for the `push_to_hub=True` Training Argument, and also implement `trainer.push_to_hub(...)` (#2718)

#### Model Cards

This patch release improves on the automatically generated model cards in several ways:

- Your training datasets are now automatically linked if they're on Hugging Face (#2711)
- A new `generated_from_trainer` tag is now also added (#2710)
- The automatically included widget examples are now improved, especially for question answering. Previously, the widget could give examples of comparing two questions with each other (#2713)
- If you save a model locally, then load it again and upload it, the model card would previously still show

  ```python
  ...
  # Download from the 🤗 Hub
  model = SentenceTransformer("sentence_transformers_model_id")
  ...
  ```

  This now gets replaced with your new model ID on Hugging Face (#2714)
- The exact training dataset size is now included in the model metadata, rather than as a bucket of e.g. 1K
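A minimal sketch combining the two trainer additions above; both arguments are inherited from `transformers.TrainingArguments`, and the model, dataset, and output path are arbitrary examples:

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("microsoft/mpnet-base")
train_dataset = load_dataset("sentence-transformers/natural-questions", split="train")
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="models/mpnet-base-nq",
    gradient_checkpointing=True,  # v3.0.1: trade compute for lower training memory
    push_to_hub=True,             # v3.0.1: upload checkpoints to the Hugging Face Hub
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
trainer.push_to_hub()  # v3.0.1: also available as an explicit call
```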

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


- [ ] If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud