foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
Apache License 2.0

Improve Acceleration Framework Integration #205

Open fabianlim opened 3 months ago

fabianlim commented 3 months ago

Is your feature request related to a problem? Please describe.

@Ssukriti has some suggestions to improve the integration that was completed in #157

Remaining work in subsequent PRs after this PR is merged:

- We need to ensure that in CI/CD all the tests run regularly and are not skipped. That means all dependencies should be installed so the tests actually execute. The purpose is to ensure that every release passes the full test suite.
- Unit tests: the additional unit tests added are good, thank you. I also want to verify that the model produced after GPTQ-LoRA tuning is in the correct format and can be loaded and inferred correctly. We have had issues in the past where something changed and the model format produced was no longer correct; we should have tests that capture this so we have full confidence (will DM about this).
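One way to capture the "model format is still correct" concern is a unit test that inspects the tuning output directory before attempting to load it. This is only a sketch: the expected adapter file names below are assumptions for illustration and depend on the peft/auto_gptq versions actually used, not something confirmed in this issue.

```python
# Sketch of a post-tuning format check for a GPTQ-LoRA run.
# EXPECTED_ADAPTER_FILES is an assumption; the real artifact list should be
# pinned down against what the tuning script actually writes out.
from pathlib import Path

EXPECTED_ADAPTER_FILES = {"adapter_config.json", "adapter_model.safetensors"}

def missing_adapter_files(output_dir: str) -> set:
    """Return the expected adapter artifacts that are absent from output_dir."""
    out = Path(output_dir)
    present = {p.name for p in out.iterdir()} if out.is_dir() else set()
    return EXPECTED_ADAPTER_FILES - present
```

A full test would additionally load the checkpoint and run a short generation, which is GPU-bound for quantized kernels and is part of why the CI runners need CUDA.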

Describe the solution you'd like

To enable these unit tests, we need CUDA-capable runners in the GitHub workflows, because the quantized kernels can only run on a GPU.
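A GitHub Actions job along these lines could gate the GPU tests. This is a sketch only: the self-hosted runner label, the extras name, and the `gpu` pytest marker are assumptions and would need to match the repository's actual setup.

```yaml
# Sketch: runner label, extras name, and test marker are assumptions.
jobs:
  gpu-tests:
    runs-on: [self-hosted, gpu]   # assumes a self-hosted CUDA-capable runner
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install all dependencies so no test is skipped
        run: pip install -e ".[dev]"   # hypothetical extras name
      - name: Run GPU-only tests
        run: pytest -m gpu tests/
```

Running this on every release, not just on demand, addresses the "tests must not be skipped" point above.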

We may also need to change the inference script so that it incorporates the AccelerationFramework as well.
