rizwan09 / REDCODER


How to prepare the data folder: ../redcoder_data #1

Closed: fYYw closed this issue 2 years ago

fYYw commented 2 years ago

Dear authors,

Many scripts in the project require the data folder ../redcoder_data. However, this folder is missing from the repo. Could you please provide instructions about how to prepare the redcoder_data folder?

Thanks!

rizwan09 commented 2 years ago

Hi Fan, thanks for your interest. I planned to release the data long ago, but all together it is quite large. Could you please let me know which parts you need right away? I can upload those first, and then you can tell me what else you need. By the way, the Codexglue-java and Codexglue-python datasets are the same as in CodeXGLUE. Thanks

fYYw commented 2 years ago

Thanks!
I think I have figured out some of the data, but I would like to confirm that my understanding is correct.

For SCODE-R:
codexglue_csnet : I downloaded the data from PLBART.
concode : I downloaded the data from PLBART.
graphcodebert-base : Noted in this repo.
retrieval_database : Noted in this repo.

For SCODE-G:
retriever_output/codexglue_csnet_text_to_code : Is this the output file of SCODE-R step 3?
codexglue_csnet_text_to_code_scode-g-preprocessed-input : This is from SCODE-G step 1.

Also, could you please share a training log? I am able to run SCODE-R step 1 and observe the validation average rank went from 4.3 to 3.2. Is this expected?

Thank you!

rizwan09 commented 2 years ago

Hi,

For SCODE-R:
codexglue_csnet : Great!
concode : Great!
graphcodebert-base : Huggingface link
retrieval_database : Google drive link

For SCODE-G:
retriever_output/codexglue_csnet_text_to_code : yes
codexglue_csnet_text_to_code_scode-g-preprocessed-input : yes
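For anyone assembling the folder by hand, the entries confirmed above can be checked with a small script. This is only a sketch: the entry names are the ones listed in this thread, but placing them directly under ../redcoder_data is an assumption, and `missing_entries` is a hypothetical helper that is not part of the REDCODER repo.

```python
from pathlib import Path

# Entries named in this thread. That they sit directly under the data root
# is an assumption; adjust the relative paths if the scripts expect otherwise.
EXPECTED_ENTRIES = [
    "codexglue_csnet",
    "concode",
    "graphcodebert-base",
    "retrieval_database",
    "retriever_output/codexglue_csnet_text_to_code",
    "codexglue_csnet_text_to_code_scode-g-preprocessed-input",
]


def missing_entries(data_root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical usage: run from the directory the scripts are launched from.
    for name in missing_entries("../redcoder_data"):
        print(f"missing: {name}")
```

Running it before the SCODE-R or SCODE-G steps should show which pieces still need to be downloaded or generated.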

> Also, could you please share a training log? I am able to run SCODE-R step 1 and observe the validation average rank went from 4.3 to 3.2. Is this expected?

Awesome. I think this is ok. I will check if I still have the log and maybe share it soon.
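For readers unfamiliar with the metric in the question, here is a minimal sketch of how an average rank is commonly computed for a retriever. It assumes each query has exactly one gold candidate, stored at the same index as the query, and that ties are broken in favor of the gold item. `average_rank` is a hypothetical helper; the thread does not show how the SCODE-R scripts compute their number, so the two may differ.

```python
def average_rank(scores):
    """Mean rank of the gold candidate across queries (1 = best).

    scores[i][j] is the similarity between query i and candidate j.
    Assumption: the gold candidate for query i is candidate i.
    """
    ranks = []
    for i, row in enumerate(scores):
        gold_score = row[i]
        # The gold rank is one plus the number of candidates that score strictly higher.
        rank = 1 + sum(1 for s in row if s > gold_score)
        ranks.append(rank)
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    toy_scores = [
        [0.9, 0.1, 0.2],  # gold ranked 1st
        [0.8, 0.5, 0.3],  # gold ranked 2nd
        [0.1, 0.2, 0.7],  # gold ranked 1st
    ]
    print(average_rank(toy_scores))
```

Under this definition a lower value is better, so a drop from 4.3 to 3.2 on validation would mean the gold item is, on average, being retrieved closer to the top.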

rizwan09 commented 2 years ago

Please give me a bit of time for the Google Drive link. I am currently uploading the database for codexglue-text-to-code.

fYYw commented 2 years ago

Thanks much! Closing the issue as all my questions are answered.

polangjushi commented 1 year ago

Dear authors, for the data related to the SCODE-G part, I only found the test file ‘python_csnet_code_text_retrieval_dedup_valid_30.json’ on Google Drive; I could not find any train or valid files. Where can I download them?