segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
MIT License

Update transformers dependency #124

Closed carschno closed 3 months ago

carschno commented 3 months ago

The transformers dependency is pinned to version 4.29.2. However, the most recent transformers version is 4.44.0 at the time of writing (see Releases for updates).

Downstream projects will require a recent transformers version, so the pin should be updated or removed to allow compatibility with more recent versions.

Is there a reason for pinning this dependency version?

carschno commented 3 months ago

I see now there is some inconsistency between setup.py and requirements.txt.

bminixhofer commented 3 months ago

requirements.txt is intended to ensure reproducibility of our experiments and benchmark results, while setup.py should allow as many users as possible to install the package. As far as I know, this is a fairly common distinction: https://stackoverflow.com/a/7085000
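To make the distinction concrete, here is a minimal sketch of how the two files typically differ. The lower bound in `INSTALL_REQUIRES` is a hypothetical value chosen for illustration and is not taken from the actual wtpsplit setup.py; only the 4.29.2 pin comes from this thread. The helper `is_pinned` is likewise just an illustrative name.

```python
# setup.py side: a loose range, so the package can be installed
# next to whatever transformers version a downstream project needs.
# (Hypothetical bound; in a real setup.py this list would be passed
# as setuptools.setup(install_requires=INSTALL_REQUIRES).)
INSTALL_REQUIRES = [
    "transformers>=4.22",
]

# requirements.txt side: an exact pin that records the environment
# used for experiments and benchmarks.
PINNED_REQUIREMENTS = """\
transformers==4.29.2
"""


def is_pinned(requirement: str) -> bool:
    """Return True if the requirement string uses an exact '==' pin."""
    return "==" in requirement


for line in PINNED_REQUIREMENTS.split():
    print(line, "pinned:", is_pinned(line))
for req in INSTALL_REQUIRES:
    print(req, "pinned:", is_pinned(req))
```

Installing with pip only consults the setup.py side, so the exact pin in requirements.txt does not constrain downstream users.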

Having out-of-date versions in requirements.txt should not be an issue, since the installation instructions recommend pip anyway: https://github.com/segment-any-text/wtpsplit?tab=readme-ov-file#installation.

We could add a note to the README to clarify this though.

carschno commented 3 months ago

You are right about the requirements.txt, I missed that. My apologies.

The issue I encountered regarding the transformers dependency was unrelated. I have been looking into it and found out it is about the indirect adapters dependency. With a transformers version >=4.40, I get the following error when I try to install my package (my_package):

$ pip install -e .
[...]
INFO: pip is looking at multiple versions of adapters to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install adapters==0.2.1, my_package==0.0.1 and my_package because these package versions have conflicting dependencies.
    adapters 0.2.1 depends on transformers~=4.39.3
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

However, when I have transformers==3.39.0, this does not seem to be an issue.
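For anyone hitting the same resolver error: the `~=4.39.3` constraint reported for adapters 0.2.1 is a "compatible release" specifier, meaning `>=4.39.3` and `==4.39.*`. The following is a simplified sketch of that rule for plain numeric versions only (no pre-releases or other suffixes). The function names are my own; for real use, `packaging.specifiers.SpecifierSet` handles the full specification.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Split a plain numeric version like '4.39.3' into integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """Check `candidate` against `~=spec` (simplified, numeric versions only)."""
    base = parse(spec)
    cand = parse(candidate)
    if len(base) < 2:
        raise ValueError("~= needs at least two version components")
    # Pad with zeros so that e.g. '4.40' compares like '4.40.0'.
    width = max(len(base), len(cand))
    base_padded = base + (0,) * (width - len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    # All but the last component of the spec must match exactly,
    # and the candidate must not be older than the spec.
    prefix_len = len(base) - 1
    return (
        cand_padded >= base_padded
        and cand_padded[:prefix_len] == base[:prefix_len]
    )


print(satisfies_compatible_release("4.39.3", "4.39.3"))  # True
print(satisfies_compatible_release("4.44.0", "4.39.3"))  # False
```

This is why any transformers version from 4.40 onwards cannot be installed together with adapters 0.2.1.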

The issue itself is hence resolved. I have not been able to figure out why adapters==0.2.2 does not work with wtpsplit; there must be an indirect dependency somewhere in the tree.