recommenders-team / recommenders

Best Practices on Recommendation Systems
https://recommenders-team.github.io/recommenders/intro.html
MIT License

[BUG] Unable to access MIND dataset due to public access restriction on the storage account #2133

Closed mmahdigh closed 2 months ago

mmahdigh commented 3 months ago

Hi,

I am trying to download the MIND Dataset, but the storage account does not allow public access and returns a 409 error.

I'll attach an example of the error message that occurs on all the download links:

$ curl https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip

<?xml version="1.0" encoding="utf-8"?><Error><Code>PublicAccessNotPermitted</Code><Message>Public access is not permitted on this storage account.
RequestId:2a55c6b5-401e-004d-0626-df374d000000
Time:2024-07-26T06:36:31.8660488Z</Message></Error>

Many thanks for considering my request!

mmahdigh commented 3 months ago

https://github.com/msnews/MIND/issues/17

fn-hide commented 3 months ago

Me too:

HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
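
For anyone scripting the download, the XML body returned by the storage account can be parsed to tell this failure apart from other errors. Below is a minimal sketch using only the Python standard library. `parse_blob_error` and `SAMPLE_BODY` are hypothetical names chosen for illustration and are not part of recommenders. The sample body is the one quoted in this issue.

```python
import xml.etree.ElementTree as ET

# Error body copied from the curl output in this issue.
SAMPLE_BODY = (
    '<?xml version="1.0" encoding="utf-8"?><Error>'
    "<Code>PublicAccessNotPermitted</Code>"
    "<Message>Public access is not permitted on this storage account.\n"
    "RequestId:2a55c6b5-401e-004d-0626-df374d000000\n"
    "Time:2024-07-26T06:36:31.8660488Z</Message></Error>"
)


def parse_blob_error(body: str) -> dict:
    """Extract code, summary, and key:value details from an XML error body.

    Hypothetical helper: it assumes the <Error><Code/><Message/></Error>
    layout seen in this issue and nothing more.
    """
    # Parse from bytes so the utf-8 encoding declaration is honoured.
    root = ET.fromstring(body.strip().encode("utf-8"))
    code = root.findtext("Code", default="")
    message = root.findtext("Message", default="")

    # The first non-empty line is the human-readable summary; the following
    # lines look like "Key:Value" (RequestId, Time).
    lines = [line.strip() for line in message.splitlines() if line.strip()]
    details = {}
    for line in lines[1:]:
        key, sep, value = line.partition(":")
        if sep:
            details[key] = value
    return {
        "code": code,
        "summary": lines[0] if lines else "",
        "details": details,
    }
```

A caller could check `parse_blob_error(body)["code"] == "PublicAccessNotPermitted"` to report that the dataset host has disabled public access, rather than surfacing a bare 409.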

miguelgfierro commented 3 months ago

The tests are failing with the same error: https://github.com/recommenders-team/recommenders/actions/runs/10103307587/job/27940996452

FYI @SimonYansenZhao @anargyri

miguelgfierro commented 3 months ago

@Leavingseason I think the MIND URL is broken. Do you have a copy of the data?

miguelgfierro commented 2 months ago

We found the small version, but we still don't have the large one. See #2145.

miguelgfierro commented 2 months ago

Update: PR #2145 has the new URLs for the small and large versions. They should work. We are still reviewing some tests before we merge.

If you are working with MIND in the meantime, the datasets can be downloaded from the new URLs in the PR.
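
Until the PR is merged, a script can try a list of candidate URLs in order and use the first one that responds. The sketch below is only an illustration under stated assumptions: `http_status` and `first_reachable` are hypothetical helpers, not functions from recommenders, and the candidate URLs must be supplied by the caller (for example, the old URL followed by the ones listed in the PR).

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request to `url` (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error statuses such as 409 arrive as exceptions; return the code instead.
        return err.code


def first_reachable(
    urls: Iterable[str],
    probe: Callable[[str], int] = http_status,
) -> Optional[str]:
    """Return the first URL whose probe reports 200, or None if none does."""
    for url in urls:
        try:
            if probe(url) == 200:
                return url
        except OSError:
            # Connection failures and timeouts: move on to the next candidate.
            continue
    return None
```

Passing `probe` as a parameter keeps the selection logic testable without network access. A real download would still need to fetch and unzip the chosen file.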