Kaggle / kagglehub

Python library to access Kaggle resources
Apache License 2.0
42 stars 7 forks source link

Add integration test for downloading public model unauthenticated #128

Closed rosbo closed 2 months ago

rosbo commented 2 months ago

We started using the ListModelInstanceVersionFiles in #122 (not yet released but merged). However, I noticed this doesn't work for unauthenticated users.

I added an integration test to ensure we don't regress on this once the backend is fixed.

http://b/341160276

rosbo commented 2 months ago

@stevemessick, the list API is now called by unauthenticated users. However, the API fails. This PR adds an integration to ensure we support this case. However, we will need to update the backend to properly support this.

Similar to the ModelApiService#DownloadModelInstanceVersion RPC, we should allow unauthenticated users: https://github.com/Kaggle/kaggleazure/blob/dc89139c44f4d0ab33c1331c4326340619b3f6a0/Kaggle.Sdk/models/model_api_service.proto#L81-L82

Given it returns files inside a databundle, we should make sure to checks the modelInstanceVersions.use (should fail for users missing consent) permission: https://github.com/Kaggle/kaggleazure/blob/dc89139c44f4d0ab33c1331c4326340619b3f6a0/Kaggle.Sdk/models/model_service.proto#L642C11-L642C33

And then, we should add a unit test to ensure unauthenticated users are allowed to download from the API: https://github.com/Kaggle/kaggleazure/pull/26241/files

I re-opened the bug and added the context above in the comment: https://b.corp.google.com/issues/341771805#comment3

rosbo commented 2 months ago

FYI @neshdev, @mohami2000, @jplotts, @jeward414 let's not release a new kagglehub version until the ListModelInstanceVersionFilesHandler is fixed.

stevemessick commented 2 months ago

Created http://b/343982117

rosbo commented 2 months ago

@stevemessick fixed the issue in https://github.com/Kaggle/kaggleazure/pull/29811 and the integration test is now passing 😄