Save ludwig-config with model-weights in output directory

sanjaydasgupta commented 8 months ago

This PR adds a boolean parameter save_ludwig_config_with_model_weights to the API's train() and experiment() methods that allows the caller to indicate that the user-provided ludwig-configuration should be added to the output directory (along with the model weights). When this parameter is set to True, a file named ludwig_config.json is added to the output directory. The ludwig config will thus be uploaded to HF along with the model weights whenever the API's upload_to_hf_hub() is called. The save_ludwig_config_with_model_weights parameter has a False default, so the feature will not affect pre-existing code or behavior. Enabling users to share or reproduce each other's results is important in many contexts, so publishing the Ludwig config along with a fine-tuned model's weights on HF will be a helpful.

This PR implements this issue.

This feature can be tested by the new integration test def test_ludwig_config_save(save_ludwig_config_with_model_weights, tmp_path) in test_model_training_options.py

github-actions[bot] commented 8 months ago

Unit Test Results

  6 files ±      0   6 suites ±0 13m 40s :stopwatch: - 16m 25s 12 tests - 2 981   7 :heavy_check_mark: - 2 974   5 :zzz: - 7 0 :x: ±0 60 runs - 2 969 30 :heavy_check_mark: - 2 975 30 :zzz: +6 0 :x: ±0

Results for commit 7dcc2b47. ± Comparison against base commit 606c732a.

:recycle: This comment has been updated with latest results.

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky , this submission is now ready for your review.

alexsherstinsky commented 8 months ago

Hi @alexsherstinsky , this submission is now ready for your review.

Hi @sanjaydasgupta -- in order to make a pull request ready for review, please click the button "Ready for review" -- then a number of people will be able to review it and offer their thoughts. Thank you very much!

w4nderlust commented 8 months ago

Sanjay can you help me understand the reson for this? The config is already in the output directory, not sure why you need to add it one more time.

sanjaydasgupta commented 8 months ago

Hi @w4nderlust and @alexsherstinsky, apologies for dropping off abruptly. Something else needed my attention, so I had to take a break from this. I will be back at this by the end of my workday today.

I understand there has been a discussion on the value of saving (to HF, along with the model-weights) the user-provided config as-is, and that it is now desired to save the file named model_hyperparameters.json instead (refer attached screenshot of sample training-session outputs)

While I am happy to run with this decision, allow me to state a few newbie's perspectives:

The name of that file (intended to be containing the configuration) is misleading, and could confuse those that are new to Ludwig's ways. Is giving it a different name an option at all?
If the intent is to allow user's (and experts) to share their successes and/or difficulties would it not be more helpful to pass on the config as provided by the user (which is both more concise, and more likely to be familiar to newcomers)?

Thanks for any insight or additional factors I may have missed.

typical-ludwig-output

alexsherstinsky commented 8 months ago

Hi @sanjaydasgupta -- as per our conversation in Slack, this file can be named "ludwig_config.json". The reason we want the full configuration is that it will enable us to know what changed in case a new Ludwig version introduces changes in the configuration -- so this is the right thing to do, even if it is a larger file. Finally, we want this "ludwig_config.json" to be in the same directory as the other files on HuggingFace (README, adapter_config.json, and the weight shards in the safetensors format). Please let us know if you have further thoughts. Thank you very much for working on this!

sanjaydasgupta commented 8 months ago

@alexsherstinsky, it is now very attractive to go back to the very first proposal we discussed:

Give upload_to_hf_hub(...) an additional parameter upload_config: bool = False that users can set to True to have the Ludwig config deposited to HF along with the model weights.

Since the user has to pass in the location of the model weights (argument model_path), finding the model_hyperparameters.json file is now a trivial matter. We just have to augment upload_to_hf_hub(...) to also upload the config file after doing what it already does now.

I have researched HF, and found that the Hub API also has an upload_file operation. This method also allows in-flight renaming, so naming the file ludwig_config.json on HF is trivial (check the argument path_in_repo).

Please let me know if you have any concerns about this approach. I will proceed to flesh out the implementation after your go ahead.

alexsherstinsky commented 8 months ago

@alexsherstinsky, it is now very attractive to go back to the very first proposal we discussed:

Give upload_to_hf_hub(...) an additional parameter upload_config: bool = False that users can set to True to have the Ludwig config deposited to HF along with the model weights.

Since the user has to pass in the location of the model weights (argument model_path), finding the model_hyperparameters.json file is now a trivial matter. We just have to augment upload_to_hf_hub(...) to also upload the config file after doing what it already does now.

I have researched HF, and found that the Hub API also has an upload_file operation. This method also allows in-flight renaming, so naming the file ludwig_config.json on HF is trivial (check the argument path_in_repo).

Please let me know if you have any concerns about this approach. I will proceed to flesh out the implementation after your go ahead.

Hi @sanjaydasgupta It is great that you found an upload_file() function -- it will enable us to have an elegant solution. The flag is not needed -- we should always upload the configuration when the user calls upload_to_hf(). This is because the current lack of having the full configuration file in the HuggingFace files directory is considered as a gap -- so you are providing a bug fix. Thank you very much!

sanjaydasgupta commented 8 months ago

Thanks @alexsherstinsky, I will start coding the fix soon.

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky my code is now ready for review.

All of the changes are limited inside the function Ludwig.utils.upload_utils.HuggingFaceHub.upload()

There are now two distinct upload operations in that function: upload the model-weights folder, and upload the config file.

I have specialised the commit_message and commit_description (only when they are provided) for the two separate upload operations by suffixing (weights) and (config) respectively. The user should see two separate INFO messages in the success case. The text of the INFO messages have also been specialised to Model weights uploaded to ... and Model config uploaded to ... respectively.

The two uploads could, in theory, fail separately. I reasoned that any partial failure would likely be caused by a transient network condition, so there is no benefit in coding a recovery strategy that kicks in immediately. The process for recovering from any incomplete transfer is manual, and lies entirely with the user. The INFO messages (or lack thereof) are there to help them figure what needs to be done.

I did not find any scripted tests that could be augmented to handle this functionality, but the code has been unit tested successfully in isolation (success case only).

Please let me know if I have missed anything.

sanjaydasgupta commented 8 months ago

@sanjaydasgupta This is great -- I am just requesting type-hints (as part of the overall code quality improvement effort). Also, any thoughts about a test? It looks like we did not have one before, so we would not block this pull request for the test, but it would be great to have them for HF upload in the future. Your ideas are welcome. Thank you!

Hi @alexsherstinsky, I am happy to review this submission for type hints, and will work on a revised version.

A set of tests for the HF interactions could be written too, but I'll need to understand that side of the code first.

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky, this is a quick update that adds type hints only in the code that I changed.

I also added the comment # Upload the ludwig configuration file just above the code that calls upload_file() -- to be consistent with a similar preexisting comment above the part that calls upload_folder().

I will work on the type hints for the module and tests for the HF integration, but please allow me to splice those efforts into my schedule as separate (possibly chunked) pieces. I will discuss my strategy for handling these in the coming week.

sanjaydasgupta commented 8 months ago

The unit test resources, and HF screen shot after running the script ...

upload-unit-test-resources.tar.gz

Had to zip the notebook and the PNG as bare notebooks not supported. Also, the files uploaded just have suitable filename and filetype, but not the right content format. So please do not try to download a model from the repo :-))

sanjaydasgupta commented 8 months ago

@sanjaydasgupta I think that we have an opportunity to add test cases to test_upload_to_hf_hub__validate_upload_parameters in tests/ludwig/utils/test_upload_utils.py so as to recognize the ludwig_config.json file. Please let me know what you think. Thank you very much. P.S.: At some point in the future, we will also want to build a test double for actually simulating the upload to HuggingFace, but for now, let us defer it. Thanks!

Let me take a look

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky please take a look at my updates. Thank you.

sanjaydasgupta commented 8 months ago

Hi @alexsherstinsky please take a look at my updates. Thank you.

I will fix the build errors my morning. Thanks!

ludwig-ai / ludwig

Save ludwig-config with model-weights in output directory #3965

Unit Test Results