Fix for 'upload_to_hf_hub()' path mismatch with 'save()'

sanjaydasgupta commented 8 months ago

The code in this PR provides a solution for this issue: upload_to_hf_hub model path mismatch with model.save while also remaining backward compatible with the previous behavior of upload_to_hf_hub.

With this change, upload_to_hf_hub can accept any of the following kinds of path as its model_path argument:

a path to the experiment folder (existing implementation).
a path to the model folder (added by this PR).

The names experiment and model refer to places in the folder hierarchy created when a model is trained:

results / experiment / model / model_weights

The model folder is additionally distinguished by containing the model_hyperparameters.json file. The model_weights folder contains one or more of these files: pytorch_model.bin, adapter_model.bin, adapter_model.safetensors.

The changes to upload_to_hf_hub() cause it to test the model_path argument to determine which kind of path was passed. If the path is a model folder, then the parent path (the corresponding experiment path) is obtained and used instead. So the logic of upload_to_hf_hub and its supporting functions do not have to change at all. ~It uses that knowledge for its own purposes and also passes it down to all supporting functions (as an additional argument model_path_is_experiment_path: bool = True) for their use. The default value of model_path_is_experiment_path has been chosen to ensure the backward compatible behavior.~

~A unit test has been added (test_upload_to_hf_hub__validate_upload_parameters2()) for the new use-case.~ And the feature has been integration tested with custom code on a local build.

github-actions[bot] commented 8 months ago

Unit Test Results

  6 files ±0   6 suites ±0 13m 40s :stopwatch: -45s 12 tests ±0   7 :heavy_check_mark: -   2   5 :zzz: +  2 0 :x: ±0 60 runs ±0 30 :heavy_check_mark: - 12 30 :zzz: +12 0 :x: ±0

Results for commit aa1750ef. ± Comparison against base commit b2cb8b0e.

This pull request skips 2 tests.

``` tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.gbm.yaml] tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.gbm.yaml] ```

:recycle: This comment has been updated with latest results.

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky, please take a look. Thanks.

sanjaydasgupta commented 8 months ago

@sanjaydasgupta I worry that it may not be safe to change a global constant as it may adversely impact backward compatibility (as in -- a user might rely on it). Is there any way for us in the present PR to constrain the changes in such a way that we do not modify the existing constants but reuse them and define a new one, if needed? Thank you!

I think there is a misunderstanding, let me clarify what I meant:

The global constant MODEL_WEIGHTS_FILE_NAME (for the literal string “model_weights”) existed previously, and continues to be used to denote “model_weights”. I have not changed that. In fact I have replaced “model_weights” with the global constant in some more places in the code.

I added the (previously unused) global constant MODEL_FILE_NAME for the literal string “model”. Most occurrences of “model” in the code have been replaced with the global constant. A few instances of “model” remain — their context was different (e.g. repository type etc.)

My comment about their being poor choices was due to the use of “FILE” instead of “DIRECTORY” or “FOLDER”. Both constants refer to folders, not file-names.

Please let me know if I misunderstood your comment. Thanks.

sanjaydasgupta commented 8 months ago

@sanjaydasgupta You are correct (I just got concerned about a 38-module change!). Please sync with the latest "master" branch, and I will approve. Thank you very much!

Hi @alexsherstinsky this PR is now ready for merge.

ludwig-ai / ludwig

Fix for 'upload_to_hf_hub()' path mismatch with 'save()' #3977

Unit Test Results