Closed KeijiBranshi closed 1 day ago
Note: Links to docs will display an error until the docs builds have been completed.
There are 1 currently active SEVs. If your PR is affected, please view them below:
As of commit 00493ef2b493a206b46fa67632962f9e7dce8b99 with merge base d9b6c792f23c7e35dcb4a3a29fee0f30f54f4894 (): :green_heart: Looks good so far! There are no failures yet. :green_heart:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Awesome stuff!! Will be reviewing this today or tomorrow :)
Will wait for CI, but LGTM!
Thanks @KeijiBranshi for your hard work on this and making it easier for torchtune users to download models from the Kaggle Hub!
Thanks so much for your help and review! Really appreciated your guidance on everything.
Looks like some of my test are failing. Could've sworn I ran before pushing, but i guess it slipped through. Resolving those now π
@joecummings success!
Thanks again for all your help. Let me know if anything else needs to be done for the release process π
@joecummings quick fyi, I just rebased to the most recent main
branch commit.
Let us know if there's anything else we should do to get this merged π
@joecummings quick fyi, I just rebased to the most recent
main
branch commit.Let us know if there's anything else we should do to get this merged π
Thanks for the rebase. Will merge this today!! Afterwards, I'll kick off a "nightly" build so users can have immediate access to this feature even before tomorrow's build of torchtune.
@joecummings quick fyi, I just rebased to the most recent
main
branch commit. Let us know if there's anything else we should do to get this merged πThanks for the rebase. Will merge this today!! Afterwards, I'll kick off a "nightly" build so users can have immediate access to this feature even before tomorrow's build of torchtune.
Thank you so much!! Looking forward to it.
Please ping me if things break or users see issues π. If I'm unresponsive for any reason, you can also email kaggle-models@google.com.
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
https://github.com/pytorch/torchtune/issues/1852
Changelog
What are the changes made in this PR?
--source
,--kaggle-username
, and--kaggle-api-key
--source
value (only valid options arehuggingface
andkaggle
). When--source
is not provided, defaults tohuggingface
_download_from_kaggle()
_download_from_huggingface()
with a few notable exceptions:kagglehub
requires the caller's username and api token as separate arguments tokagglehub.set_kaggle_credentials()
(these can also be set as environment variables). Since both are required, appropriate error messages are raised when one is omitted. However, since Kaggle supports anonymous downloads on public non-gated models, the option to omit both is also supported.kagglehub.model_download
are different than those raised byhuggingface_hub.snapshot_download
.kagglehub.model_download
doesn't support direct download to a caller-specified directory via--output-dir
. This feature is being worked on inkagglehub
(see https://github.com/Kaggle/kagglehub/issues/179), but in the meantime a warning was just placed here instead._validate_kaggle_model_handle
) as per the requests made here. This validation is primarily used to raise warnings when downloads may not be guaranteed to work with torchtune.--source kaggle
is providedTest plan
Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.
pre-commit install
)pytest tests
run recipe tests viapytest tests -m integration_test
manually run any new or modified recipes with sufficient proof of correctnessinclude relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it. Here is a docstring example and a tutorial example