mciniselli closed 2 years ago
Hi, thanks for your interest! We are still in the process of resolving the potential risks of releasing the extra C/C# data collected from BigQuery. For the CodeSearchNet data, we use all examples outside the validation and test sets for pre-training CodeT5. You can access this data from its official repo.
Hi, thank you for this amazing model! I was wondering if you could share with us the 8.3M-method dataset used for pre-training.
Thank you very much! Matteo