Open superway117 opened 3 months ago
Sorry, we have only released the pre-trained model so far. You can find the description of the dataset engine in Section 3.1 of our paper.
I want to reproduce your work on the dataset. I would appreciate it if you could share some demo data from the dataset or provide the script used to build it.
The compilation pipeline is complicated and not ready to be open-sourced. I could provide some demo data and the scripts that request explanations from an LLM once I have some free time :) Sorry about that.
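For anyone who wants to experiment in the meantime, here is a minimal sketch of what a script that asks an LLM to explain code snippets might look like. This is not the authors' pipeline: `build_prompt`, `explain_snippets`, the prompt wording, and the `ask_llm` callable are all hypothetical, and you would need to plug in your own LLM client.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical prompt wording; the prompt actually used by the authors is not public.
PROMPT_TEMPLATE = (
    "Explain what the following code snippet does in one or two sentences.\n\n"
    "{snippet}\n"
)


def build_prompt(snippet: str) -> str:
    """Wrap a single code snippet in the explanation prompt."""
    return PROMPT_TEMPLATE.format(snippet=snippet.strip())


def explain_snippets(
    snippets: Iterable[str],
    ask_llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each non-empty snippet with the description returned by `ask_llm`.

    `ask_llm` is an assumption: any function that takes a prompt string and
    returns the model's reply as a string.
    """
    pairs = []
    for snippet in snippets:
        snippet = snippet.strip()
        if not snippet:
            continue  # skip blank entries rather than querying the model
        description = ask_llm(build_prompt(snippet)).strip()
        pairs.append({"code": snippet, "description": description})
    return pairs


if __name__ == "__main__":
    # Stand-in for a real LLM call, so the sketch runs offline.
    def fake_llm(prompt: str) -> str:
        return "placeholder description"

    for pair in explain_snippets(["mov eax, 1", "ret"], fake_llm):
        print(pair)
```

Passing the LLM call in as a plain function keeps the sketch independent of any particular provider and lets you test it with a stub.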
About this point: "Utilizing a dataset engine capable of automatically generating 195 million pairs of code snippets and their descriptions"