Kaggle / kagglehub

Python library to access Kaggle resources
Apache License 2.0

Upload for private model yields "No instances available" #103

Closed by lewtun 3 months ago

lewtun commented 4 months ago

Hello, thank you for making this utility lib for Kaggle!

I'm trying to upload a local model to Kaggle Hub and want it to be private. However, after following the instructions in the README, I am not able to import the model into a Kaggle notebook and instead see "No instances available" on the notebook's Add input tab:

Screenshot 2024-04-16 at 21 51 49

Steps to reproduce:

import kagglehub

handle = "lewtun/mistral-7b-sft/pyTorch/v1"
local_files = "./mistral-7b-sft/" # Just a fine-tuned Mistral 7B
kagglehub.model_upload(handle, local_files)

Is it possible that something is wrong with the variation being set during the upload? Thanks!

lewtun commented 4 months ago

Although my variation doesn't appear on the model page, I notice that if I click on Add new variation, I can see the variation stored under pyTorch, but as a zipped archive instead of the decompressed folder with the model weights (perhaps this is the problem?)

Screenshot 2024-04-16 at 21 56 43

rosbo commented 4 months ago

Thanks for reporting. We are looking into this issue. The archive should be decompressed. This is a problem. Stay tuned.

lewtun commented 4 months ago

OK, it seems the archive does eventually decompress, but for some reason it only includes a subset of the files (e.g. just one of the weights) and seems to have merged the remainder into special_tokens_map.json (this file is 5GB when it should be 500 bytes).

Screenshot 2024-04-19 at 13 18 03

CausalTruth commented 4 months ago

I have the same issue. Only one file gets zipped and uploaded, not the complete directory.
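For anyone trying to confirm the same symptom (missing files, or a small file such as special_tokens_map.json showing up at several GB), a minimal sketch along these lines may help. It only uses the Python standard library to list the local files and their sizes, so they can be compared by hand with what the model page shows. The helper names local_manifest and find_mismatches are hypothetical, and the "remote" sizes are assumed to be typed in manually from the Kaggle UI; nothing here calls the Kaggle API.

```python
import os


def local_manifest(root):
    """Map each file path (relative to root, '/'-separated) to its size in bytes."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            manifest[rel_path] = os.path.getsize(full_path)
    return manifest


def find_mismatches(local, remote):
    """Return local files that are missing from `remote` or differ in size.

    `remote` is a dict of path -> size that you fill in yourself (an assumption:
    the sizes are copied by hand from the model page).
    """
    problems = {}
    for path, size in local.items():
        if path not in remote:
            problems[path] = ("missing", size, None)
        elif remote[path] != size:
            problems[path] = ("size differs", size, remote[path])
    return problems


if __name__ == "__main__":
    # Directory name taken from the reproduction steps in this thread.
    for path, size in sorted(local_manifest("./mistral-7b-sft/").items()):
        print(f"{size:>15,d}  {path}")
```

A file listed locally at a few hundred bytes but shown as gigabytes on the model page would match the behaviour described above.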

lewtun commented 3 months ago

These issues now seem to be fixed in kagglehub==0.2.4, especially now that files are pushed individually!
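A quick way to check whether the installed release is at least the 0.2.4 mentioned here is sketched below. It reads the installed version with importlib.metadata from the standard library. The helper names (parse_version, is_fixed, installed_kagglehub_version) are hypothetical, and the parser is deliberately simple: it assumes a plain dotted numeric version and treats a suffix such as "rc1" as if it were the final release.

```python
from importlib.metadata import PackageNotFoundError, version

# Release reported in this thread as fixing the upload problem.
MIN_VERSION = (0, 2, 4)


def parse_version(text):
    """Turn '0.2.4' into (0, 2, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            # Suffix such as 'rc1' found; ignore everything after it.
            break
    return tuple(parts)


def is_fixed(installed):
    """True if the given version string is at least MIN_VERSION."""
    return parse_version(installed) >= MIN_VERSION


def installed_kagglehub_version():
    """Return the installed kagglehub version string, or None if not installed."""
    try:
        return version("kagglehub")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_kagglehub_version()
    if found is None:
        print("kagglehub is not installed")
    elif is_fixed(found):
        print(f"kagglehub {found} should include the per-file upload behaviour")
    else:
        print(f"kagglehub {found} is older than 0.2.4; consider upgrading")
```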

The only caveat is that it takes >1h to process the files after upload, but eventually the variation appears on the model page.
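Given the long processing time, one option is to retry the download at a slow interval instead of checking the page by hand. The sketch below is a generic retry helper, not part of kagglehub: wait_until_available is a hypothetical name, and it assumes that whatever callable you pass in raises an exception while the variation is still being processed. The exact exception kagglehub raises in that state is not confirmed in this thread, so the helper catches Exception broadly.

```python
import time


def wait_until_available(fetch, attempts=12, delay_seconds=600, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting delay_seconds between failures.

    Returns whatever fetch() returns. Raises TimeoutError (chained to the last
    error seen) if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:  # assumption: any failure means "not ready yet"
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)
    raise TimeoutError(f"not available after {attempts} attempts") from last_error


# Example usage (requires kagglehub and credentials; handle taken from this thread):
#
#   import kagglehub
#   path = wait_until_available(
#       lambda: kagglehub.model_download("lewtun/mistral-7b-sft/pyTorch/v1")
#   )
```

With the defaults this tries 12 times, with a 10-minute wait between attempts, which covers a little under two hours.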

rosbo commented 3 months ago

Glad to hear that the issue you faced is now fixed. We are now focused on optimizing the backend processing time for large models.