Kaggle / kagglehub

Python library to access Kaggle resources
Apache License 2.0
42 stars 7 forks source link

Download model files in parallel when < X files #122

Closed rosbo closed 2 months ago

rosbo commented 3 months ago

Blocked by: https://github.com/Kaggle/kaggleazure/pull/29608 (@stevemessick is working on the backend fix)

http://b/341160276

rosbo commented 3 months ago

@jeward414 I will add tests on Monday but here is what we did together. I also added the parallel download part.

rosbo commented 2 months ago

Note: integration test is failing until https://github.com/Kaggle/kaggleazure/pull/29608 is merged (@stevemessick is working on the backend fix). I will wait for the change to be deployed before merging this PR.

rosbo commented 2 months ago

I found why the integration tests were failing on CI but not when running locally.

There is a kaggle-api-enable-pagination feature flag for the new ListModelInstanceVersionFilesHandler. It is currently released to 50%. The integrationtester on CI didn't fall into this bucket so it was always getting an empty response when listing files: https://github.com/Kaggle/kaggleazure/blob/dc89139c44f4d0ab33c1331c4326340619b3f6a0/Kaggle.Services.Models/Handlers/ModelApiService/V1/ListModelInstanceVersionFilesHandler.cs#L86

@stevemessick Are you planning to turn the feature flag to public soon? Also, did we need the feature flag for the model handler given it is new?

stevemessick commented 2 months ago

@rosbo That FF was supposed to have been 100% but the update didn't stick. I just verified it is now 100%.

The FF will be removed by https://github.com/Kaggle/kaggleazure/pull/29772.

stevemessick commented 2 months ago

I re-ran the failing check, and it is passing now.

rosbo commented 2 months ago

Thank you @stevemessick 👏