foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
Apache License 2.0
9 stars, 30 forks

Disallow installing `transformers >= 4.41` #202

Closed: kpouget closed this 1 week ago

kpouget commented 1 week ago

`transformers >= 4.41` appears to cause a performance regression in the fms-hf-tuning fine-tuning jobs. This PR keeps the regression out of the fms-hf-tuning image by disallowing the installation of `transformers >= 4.41`, so a `transformers` 4.40 release is installed instead.

This fix has been performance tested: it restores performance to the level seen before transformers 4.41 was published.

See also: RHOAIENG-8551

Closes #201

fabianlim commented 1 week ago

I feel we should try to understand the root cause, because not allowing updates means not getting new fixes/features also.

kpouget commented 1 week ago

> I feel we should try to understand the root cause, because not allowing updates means not getting new fixes/features also.

Definitely. This only keeps the regression from propagating to this repo for the time being.

fabianlim commented 1 week ago

@kpouget can you share a simple repro script?

astefanutti commented 1 week ago

> I feel we should try to understand the root cause, because not allowing updates means not getting new fixes/features also.

Also, without understanding the root cause, we cannot rule out that the change behind this performance regression actually fixes incorrect functional behavior.

kpouget commented 1 week ago

@fabianlim I've updated the ticket https://github.com/foundation-model-stack/fms-hf-tuning/issues/201 description to include the steps to reproduce.

kpouget commented 1 week ago

Hello, this PR does not appear to take effect when we build the image, as if `requirements.txt` were not being used. Could you please double-check whether that's the right way to pin the dependency?
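One way to confirm which version actually ended up in a built image is to query the installed package metadata from inside the container. The following is a minimal sketch, not part of this PR: the helper names `parse_release`, `violates_pin`, and `check_installed` are hypothetical, and the version parsing is deliberately simplified (it compares only the leading numeric components, so a pre-release such as `4.41.0rc1` is treated as 4.41).

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.40.2' -> (4, 40, 2)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # non-numeric segment such as 'dev0'
        parts.append(int(digits))
        if digits != piece:
            break  # trailing suffix such as '0rc1': stop after the number
    return tuple(parts)


def violates_pin(version: str, first_bad: tuple = (4, 41)) -> bool:
    """True if `version` is at or above the first disallowed release."""
    return parse_release(version)[: len(first_bad)] >= first_bad


def check_installed(package: str = "transformers"):
    """Return (version, violates) for an installed package, or None if absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return version, violates_pin(version)


if __name__ == "__main__":
    # Hypothetical usage: run inside the built image to see what was installed.
    print(check_installed("transformers"))
```

If the script reports a 4.41 or later release, the pin was not honored by whatever file the image build actually installs from.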

kpouget commented 1 week ago

> Wrong place to make this update; it should be in `pyproject.toml`. This file is no longer used for package dependency versions or installation.

Ack, I'm closing this one.