e-p-armstrong / augmentoolkit

Convert Compute And Books Into Instruct-Tuning Datasets
MIT License
584 stars, 79 forks

Gradio Web UI + Extended Input Folder #16

Closed cocktailpeanut closed 1 month ago

cocktailpeanut commented 1 month ago
  1. Added a Gradio web UI (app.py) that lets you load and edit the config.yaml file
  2. When triggered, app.py saves the config.yaml file and calls the processing.py file
  3. The stdout is streamed to the log view in the Gradio UI
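
The save-then-run-then-stream flow in the three steps above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the actual contents of app.py: the function names `save_config`, `stream_process`, and `save_and_run` are hypothetical, and the sketch assumes the UI hands back the edited YAML as plain text. In Gradio, a generator function bound to an event handler pushes each yielded value to its output component, so a generator like this is one way a log view could be kept up to date.

```python
import subprocess
import sys
from pathlib import Path


def save_config(text, path="config.yaml"):
    """Write the edited YAML text back to disk (hypothetical helper)."""
    Path(path).write_text(text, encoding="utf-8")


def stream_process(cmd):
    """Run cmd and yield the accumulated output after each new line."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same log
        text=True,
        bufsize=1,  # line-buffered reads on our side
    )
    log = ""
    for line in proc.stdout:
        log += line
        yield log  # the full log so far, suitable for a textbox value
    proc.wait()


def save_and_run(config_text, config_path="config.yaml", script="processing.py"):
    """Save the config, then stream the script's output (assumed flow)."""
    save_config(config_text, config_path)
    # -u asks the child Python not to buffer stdout, so lines arrive promptly
    yield from stream_process([sys.executable, "-u", script])
```

Yielding the whole accumulated log each time, rather than only the newest line, means the receiving component can simply be overwritten on every update.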

Also,

  1. Support a nested raw_txt_input folder: it can contain subfolders at any depth, and files in all of them are picked up
  2. Support .md extension files (a lot of documentation files are .md files, such as README.md on GitHub)
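
For illustration, picking up files from nested folders could look something like the sketch below. The helper name `collect_input_files` is hypothetical, and the assumption that `.txt` is the other accepted extension is mine (inferred from the folder name); the PR's actual code may differ.

```python
from pathlib import Path


def collect_input_files(root="raw_txt_input", extensions=(".txt", ".md")):
    """Return every file under root, at any depth, with an accepted extension.

    Hypothetical helper; the extension list is an assumption.
    """
    root_path = Path(root)
    return sorted(
        p
        for p in root_path.rglob("*")  # rglob walks all subfolders
        if p.is_file() and p.suffix.lower() in extensions
    )
```

Comparing the lowercased suffix means files such as `NOTES.TXT` or `README.MD` would also be included.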
darkacorn commented 1 month ago

I don't really see the need for it, as YAML is already easy to edit, tbh?!

Either they run it in Jupyter, and they have access to the YAML, or they run it on the raw system. A second web process makes very little sense to me. Please elaborate on your thinking.

revolvedai commented 1 month ago

Related to cocktailpeanut's post here: https://twitter.com/cocktailpeanut/status/1791163779624943728

darkacorn commented 1 month ago

That does not explain how that's important. I can't see the use case for it, as it's already trivial.

revolvedai commented 1 month ago

It will rapidly increase adoption and awareness of Augmentoolkit through the Pinokio platform.

The open-source LLM space has a UI problem, and Jupyter-only and terminal-only apps are part of it. A Gradio front end makes the tool more accessible to users who don't know, or care about, the difference between vim and emacs or tabs and spaces.

cocktailpeanut commented 1 month ago

Hey guys, I'm OK if you don't need this in the project. If it feels like bloat to have a web UI in this repo, I could create a separate project that makes use of augmentoolkit. I didn't want to do that because I thought this was just a useful little feature, but if it's not aligned with the direction of the project, I totally respect that and will create a separate project. I guess it would be kind of like the kohya web UI, where there is a core engine (kohya scripts) and a separate web UI that simply makes use of the scripts, which I think could be another way to do this.

Anyway, I just want to clarify that I'm super appreciative that this project exists and just wanted to make it easier to use. So even if this doesn't get merged, I'll 100% respect the decision and find another way to make sure augmentoolkit itself gets adoption (without messing up this repository), simply because it's such a cool project. Thanks!

darkacorn commented 1 month ago

Let's wait and see what Evan says. I merely asked for clarification.

e-p-armstrong commented 1 month ago

I think this is a great addition! Part of Augmentoolkit is about democratizing data generation, and making it more usable for people. I know that YAML seems easy to us programmers but frankly most people don't even know what a plaintext editor is, so this will be really helpful to a lot of people.

I'll look at merging this (want to quickly review the code first; also we might need to collaborate a bit about updating the README to include this). Thanks a ton @cocktailpeanut for the badass PR! I seriously appreciate you helping Augmentoolkit get out there, I'm happy to have your contribution.

e-p-armstrong commented 1 month ago

Looks good! Thanks for believing in the project! I seriously appreciate contributions like this.