teknium1 / GPTeacher

A collection of modular datasets generated by GPT-4: General-Instruct, Roleplay-Instruct, Code-Instruct, and Toolformer
MIT License
1.62k stars 170 forks

Create json2markdown.py #7

Closed d3287t328 closed 1 year ago

d3287t328 commented 1 year ago

Proposes a helper script that converts all the JSON files in the repo to Markdown in one shot.
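For readers without the PR diff at hand, here is a minimal sketch of what such a one-shot converter could look like. It is not the script from this PR. It assumes each JSON file holds a top-level list of objects with `instruction`, `input`, and `response` keys; the function names (`record_to_markdown`, `convert_file`, `convert_tree`) and the Markdown layout are hypothetical choices made for illustration.

```python
import json
from pathlib import Path

# Assumed record schema: these field names are an illustration, not confirmed by the PR.
FIELDS = ("instruction", "input", "response")


def record_to_markdown(record: dict) -> str:
    """Render one record as Markdown, skipping empty or missing fields."""
    parts = []
    for key in FIELDS:
        value = str(record.get(key, "")).strip()
        if value:
            parts.append(f"### {key.capitalize()}\n\n{value}")
    return "\n\n".join(parts)


def convert_file(json_path: Path) -> Path:
    """Write a .md file next to json_path and return the new path."""
    # Assumes the file contains a JSON list of record objects.
    records = json.loads(json_path.read_text(encoding="utf-8"))
    body = "\n\n---\n\n".join(record_to_markdown(r) for r in records)
    md_path = json_path.with_suffix(".md")
    md_path.write_text(body + "\n", encoding="utf-8")
    return md_path


def convert_tree(root: Path) -> list[Path]:
    """Convert every .json file under root, recursively, in sorted order."""
    return [convert_file(p) for p in sorted(root.rglob("*.json"))]
```

Calling `convert_tree(Path("."))` from the repository root would then write one `.md` file beside each `.json` file, leaving the originals untouched.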

teknium1 commented 1 year ago

Would it be a good idea to also ship prebuilt Markdown versions of all the datasets alongside this script?

d3287t328 commented 1 year ago

The only impact is significantly fewer tokens per run, which means faster startup on each run and cost savings for anyone using the API.
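A rough way to sanity-check the size claim is to compare the same records serialized as indented JSON and as Markdown. The sketch below uses character count as a crude stand-in for token count, since real token counts depend on the tokenizer in use. The record fields, the Markdown layout, and the helper names (`to_markdown`, `char_savings`) are assumptions for illustration, not taken from the PR.

```python
import json


def to_markdown(record: dict) -> str:
    """Render non-empty fields as Markdown sections (layout is an assumption)."""
    return "\n\n".join(
        f"### {key.capitalize()}\n\n{value}"
        for key, value in record.items()
        if value
    )


def char_savings(records: list[dict]) -> float:
    """Fraction of characters saved by Markdown relative to indented JSON.

    Character count is only a proxy for tokens; a positive value means the
    Markdown form is shorter.
    """
    json_len = len(json.dumps(records, indent=4))
    md_len = len("\n\n---\n\n".join(to_markdown(r) for r in records))
    return 1 - md_len / json_len


# Hypothetical record, made up for this example.
sample = [{"instruction": "Add 1 and 2.", "input": "", "response": "3"}]
print(f"characters saved: {char_savings(sample):.0%}")
```

The saving comes mostly from dropping quotes, braces, indentation, and empty fields, so it shrinks as the field values get longer relative to the JSON scaffolding.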