Closed sanjaydasgupta closed 8 months ago
6 files ±0 6 suites ±0 13m 40s :stopwatch: -45s 12 tests ±0 7 :heavy_check_mark: - 2 5 :zzz: + 2 0 :x: ±0 60 runs ±0 30 :heavy_check_mark: - 12 30 :zzz: +12 0 :x: ±0
Results for commit aa1750ef. ± Comparison against base commit b2cb8b0e.
:recycle: This comment has been updated with latest results.
Hi @alexsherstinsky, please take a look. Thanks.
@sanjaydasgupta I worry that it may not be safe to change a global constant as it may adversely impact backward compatibility (as in -- a user might rely on it). Is there any way for us in the present PR to constrain the changes in such a way that we do not modify the existing constants but reuse them and define a new one, if needed? Thank you!
I think there is a misunderstanding, let me clarify what I meant:
The global constant MODEL_WEIGHTS_FILE_NAME (for the literal string “model_weights”) existed previously, and continues to be used to denote “model_weights”. I have not changed that. In fact I have replaced “model_weights” with the global constant in some more places in the code.
I added the (previously unused) global constant MODEL_FILE_NAME for the literal string “model”. Most occurrences of “model” in the code have been replaced with the global constant. A few instances of “model” remain — their context was different (e.g. repository type etc.)
My comment about their being poor choices was due to the use of “FILE” instead of “DIRECTORY” or “FOLDER”. Both constants refer to folders, not file-names.
Please let me know if I misunderstood your comment. Thanks.
@sanjaydasgupta You are correct (I just got concerned about a 38-module change!). Please sync with the latest
"master"
branch, and I will approve. Thank you very much!
Hi @alexsherstinsky this PR is now ready for merge.
The code in this PR provides a solution for this issue: upload_to_hf_hub model path mismatch with model.save while also remaining backward compatible with the previous behavior of upload_to_hf_hub.
With this change,
upload_to_hf_hub
can accept any of the following kinds of path as itsmodel_path
argument:experiment
folder (existing implementation).model
folder (added by this PR).The names
experiment
andmodel
refer to places in the folder hierarchy created when a model is trained:results / experiment / model / model_weights
The
model
folder is additionally distinguished by containing themodel_hyperparameters.json
file. Themodel_weights
folder contains one or more of these files:pytorch_model.bin
,adapter_model.bin
,adapter_model.safetensors
.The changes to
upload_to_hf_hub()
cause it to test themodel_path
argument to determine which kind of path was passed. If the path is amodel
folder, then the parent path (the correspondingexperiment
path) is obtained and used instead. So the logic ofupload_to_hf_hub
and its supporting functions do not have to change at all. ~It uses that knowledge for its own purposes and also passes it down to all supporting functions (as an additional argumentmodel_path_is_experiment_path: bool = True
) for their use. The default value ofmodel_path_is_experiment_path
has been chosen to ensure the backward compatible behavior.~~A unit test has been added (
test_upload_to_hf_hub__validate_upload_parameters2()
) for the new use-case.~ And the feature has been integration tested with custom code on a local build.