mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Access to Rclone Download Instructions for llama2_70b_lora #769

Closed nnasirinvidia closed 2 weeks ago

nnasirinvidia commented 1 month ago

I am having issues downloading the model and data for LoRA fine-tuning. The initial steps went well, but I still have not been given access to the data.

rclone copy mlc-llama2:Llama2-70b-fused-qkv-mlperf ./Llama2-70b-fused-qkv-mlperf -P
2024/10/10 21:14:01 ERROR : Google drive root 'Llama2-70b-fused-qkv-mlperf': error reading source root directory: directory not found
2024/10/10 21:14:01 ERROR : Attempt 1/3 failed with 1 errors and: directory not found
2024/10/10 21:14:01 ERROR : Google drive root 'Llama2-70b-fused-qkv-mlperf': error reading source root directory: directory not found
2024/10/10 21:14:01 ERROR : Attempt 2/3 failed with 1 errors and: directory not found
2024/10/10 21:14:01 ERROR : Google drive root 'Llama2-70b-fused-qkv-mlperf': error reading source root directory: directory not found
2024/10/10 21:14:01 ERROR : Attempt 3/3 failed with 1 errors and: directory not found
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Errors: 1 (retrying may help)
Elapsed time: 1.8s
2024/10/10 21:14:01 Failed to copy: directory not found

When I run rclone ls mlc-llama2:, I see these two files:
12666 llm_inference_post_process_script/llm_inference_post_processing.ipynb
21350 llm_inference_post_process_script/analyze_gpt_result_summary.ipynb

nnasirinvidia commented 1 month ago

Any update on this issue?

ShriyaPalsamudram commented 1 month ago

@nathanw-mlc is this something you could help with?

nathanw-mlc commented 1 month ago

@nnasirinvidia Did you make sure you are running rclone v1.6x.x? You can check with rclone version.
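The version check can also be scripted. Below is a minimal Python sketch, assuming that "v1.6x.x" means any release from 1.60 through 1.69 and that the output of rclone version contains a token of the form "rclone vMAJOR.MINOR.PATCH". The function names parse_rclone_version and is_v1_6x are hypothetical helpers, not part of rclone.

```python
import re


def parse_rclone_version(output: str):
    """Return (major, minor, patch) found in `rclone version` output, or None."""
    match = re.search(r"rclone v(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_v1_6x(version) -> bool:
    """True for 1.60.x through 1.69.x (an assumed reading of "v1.6x.x")."""
    if version is None:
        return False
    major, minor, _patch = version
    return major == 1 and 60 <= minor <= 69


# Sample text in the same shape as the output reported later in this thread.
sample_output = "rclone v1.67.0"
print(is_v1_6x(parse_rclone_version(sample_output)))
```

To check a real installation, the text could come from subprocess.run(["rclone", "version"], capture_output=True, text=True).stdout instead of the sample string.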

nnasirinvidia commented 1 month ago

Thanks for your help. I am using the right version of rclone, based on the output of this command:
rclone version
rclone v1.67.0

nathanw-mlc commented 1 month ago

@nnasirinvidia Thanks for confirming that. Would you please email systems@mlcommons.org with the email address you used to agree to the confidentiality agreement?

mahmoodn commented 1 month ago

I have a similar problem with rclone and the Llama 2 model. The README page doesn't mention how to create the config and authenticate. It only gives the copy command.

nathanw-mlc commented 1 month ago

> I have a similar problem with rclone and the Llama 2 model. The README page doesn't mention how to create the config and authenticate. It only gives the copy command.

The ReadMe file in this repo intentionally does not include the full download instructions. You have to fill out the confidentiality notice, at which point you will receive a link to the download location and instructions.

nathanw-mlc commented 1 month ago

@nnasirinvidia

It seems that something is wrong with your Rclone remote configuration. The two file paths output when you ran rclone ls mlc-llama2 aren't even in the download location. I was able to recreate your errors by misconfiguring the rclone remote.
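This symptom can be checked mechanically by parsing the listing and looking for the directory that the copy command expects. The following is a rough Python sketch, assuming each line of rclone ls output has the form "SIZE PATH". The helper names parse_rclone_ls and top_level_names are hypothetical, and the listing text is the one reported earlier in this thread.

```python
def parse_rclone_ls(output: str):
    """Parse `rclone ls` lines of the form '<size> <path>' into (size, path) pairs."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        size, _, path = line.partition(" ")
        if not size.isdigit() or not path:
            continue  # skip anything that is not a listing line
        entries.append((int(size), path))
    return entries


def top_level_names(entries):
    """Return the set of first path components seen in the listing."""
    return {path.split("/", 1)[0] for _size, path in entries}


# Listing reported by the original poster for `rclone ls mlc-llama2:`.
listing = """\
    12666 llm_inference_post_process_script/llm_inference_post_processing.ipynb
    21350 llm_inference_post_process_script/analyze_gpt_result_summary.ipynb
"""

# Directory name taken from the failing `rclone copy` command.
expected_dir = "Llama2-70b-fused-qkv-mlperf"

names = top_level_names(parse_rclone_ls(listing))
print(expected_dir in names)
```

If the expected directory is missing from the top-level names, the remote is most likely pointing at a different Drive location than the one the copy command assumes.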

When you authenticated Rclone with Google Drive, did you make sure to configure the remote as not a Shared Drive?

Please try deleting the Rclone remote (rclone config delete mlc-llama2) and recreating it following the instructions in the CLI Download Instructions file.
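Before deleting the remote, it may help to confirm whether it was set up as a Shared Drive. The Python sketch below inspects INI-style text such as that printed by rclone config show mlc-llama2. It assumes the remote uses rclone's Google Drive backend, where a Shared Drive ID is stored under the team_drive key. The two config samples are made up for illustration, and the helper names are hypothetical.

```python
def parse_remote_config(text: str) -> dict:
    """Collect 'key = value' pairs from an INI-style remote section."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue  # skip blanks, the [section] header, and separator lines
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def uses_shared_drive(settings: dict) -> bool:
    """True when a drive remote has a non-empty team_drive (Shared Drive) ID."""
    return settings.get("type") == "drive" and bool(settings.get("team_drive"))


# Hypothetical config text; the ID is a placeholder, not a real value.
shared_drive_remote = """\
[mlc-llama2]
type = drive
team_drive = EXAMPLE_SHARED_DRIVE_ID
"""

# Hypothetical config text for a remote that is not a Shared Drive.
plain_drive_remote = """\
[mlc-llama2]
type = drive
team_drive =
"""

for label, text in (("shared", shared_drive_remote), ("plain", plain_drive_remote)):
    print(label, uses_shared_drive(parse_remote_config(text)))
```

A True result for the real remote would match the misconfiguration described above, in which case deleting and recreating the remote as suggested is the next step.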

mahmoodn commented 1 month ago

> The ReadMe file in this repo intentionally does not include the full download instructions. You have to fill out the confidentiality notice, at which point you will receive a link to the download location and instructions.

@nathanw-mlc I have already sent an email to MLC support, as the confidentiality link (https://sites.google.com/view/mlcommons-llama2?pli=1) doesn't work and I see Google's 404 error on the page.

nathanw-mlc commented 1 month ago

> I have already sent an email to MLC support, as the confidentiality link (https://sites.google.com/view/mlcommons-llama2?pli=1) doesn't work and I see Google's 404 error on the page.

@mahmoodn have you associated a Google account with the email address you used to join MLCommons?

mahmoodn commented 1 month ago

> @mahmoodn have you associated a Google account with the email address you used to join MLCommons?

I know I am hijacking someone else's topic, but I have to say that the conditions written on the README page are not clear. The registration form doesn't ask for a Google email address, only for an institutional one. The link you provided redirects to creating a new Gmail address. I already have one, but I don't know how to indicate that. Maybe it's better to send an email to support specifically about this and ask them to link my institutional address with my Gmail address.

nnasirinvidia commented 1 month ago

@nathanw-mlc I used my corporate email, not Gmail, during authentication.

nathanw-mlc commented 1 month ago

> The link you provided redirects to creating a new Gmail address.

@mahmoodn The link I provided is for creating a Google account associated with your institutional email address, not for creating a new Gmail address. As you can see in the image below, it asks you to enter your extant email address. Creating a new Gmail address is an option that's listed, but it's not the default flow nor what we want. Because we use Google tooling, you have to associate a Google account (not a Gmail address) with your institutional email address or else Google will not be able to authenticate your institutional email address with our resources. To be clear, we want your institutional email address, not a Gmail address. We list these steps on the community page of our website, as well as every working group page, but this process is admittedly confusing, partially because most people don't know that you can create a Google account with an already extant non-Google email address rather than creating a new Gmail address.

image

nathanw-mlc commented 1 month ago

> @nathanw-mlc I used my corporate email, not Gmail, during authentication.

@nnasirinvidia the problem you're facing is completely separate from that of @mahmoodn. You're all good on the confidentiality form front. Something is just wrong with your Rclone remote config. Please see my previous comment about it.

mahmoodn commented 1 month ago

@nathanw-mlc Thanks, I think that step is now complete.