In the current 1.x implementations, the number of examples (after the creation of contrastive pairs) is computed in trainer.py as:
"logger.info(f" Num examples = {len(train_dataloader)}")"
This means the logged value is actually the number of batches, which only equals the number of examples when batch size = 1. That seems misleading at best! In the tutorials (which were apparently prepared for version 0.x), the reported value does appear to be correct.
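For illustration, here is a minimal sketch of the difference, assuming a standard PyTorch DataLoader (which, as far as I can tell, is what the trainer iterates over; the dataset contents here are just placeholders):

```python
# Sketch: len(dataloader) counts batches, not examples.
from torch.utils.data import DataLoader

dataset = list(range(100))      # 100 placeholder "examples"
dataloader = DataLoader(dataset, batch_size=16)

print(len(dataloader))          # number of batches: ceil(100 / 16) = 7
print(len(dataloader.dataset))  # number of examples: 100
```

So a log line built from len(train_dataloader) will only match the example count when batch_size=1.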
Could somebody clarify whether this calculation in the 1.x versions is intended, or whether it is actually a bug?
This behaviour can easily be reproduced by running https://github.com/huggingface/setfit/blob/main/notebooks/text-classification.ipynb with a 1.x version and then with 0.x for comparison.
Thanks in advance!