h2oai / haic-doc-issues-requests

Documentation issues/requests for engines, models, and applications in the H2O AI Cloud

[HAIC-APP] Lack of documentation #5

Open · physicalit opened 8 months ago

physicalit commented 8 months ago

Documentation issue/request

There is zero useful information on quantization: how do I perform it, what settings should I choose for the different quantization types (Q8, Q5, etc.), and what would be the difference in the resulting model? Nowhere does it specify what kind of model I will have after fine-tuning and how to use it (the normal export is in one format and the one exported to Hugging Face is in another). What do I do with the result? How can I get different quantization types as output so that I can run them with Ollama and compare them to understand which fits my needs best?
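For context, the workflow being asked about usually looks something like this with the llama.cpp tooling. All script names, binary names, and paths below are assumptions about a typical setup, not anything LLM Studio documents:

```python
import subprocess

# Assumed locations: an HF-format model directory exported from LLM Studio,
# and a local llama.cpp checkout providing the conversion/quantization tools.
HF_MODEL_DIR = "./my-finetuned-model"  # hypothetical export location
LLAMA_CPP = "./llama.cpp"              # hypothetical llama.cpp checkout

# 1. Convert the HF export into a single full-precision GGUF file.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert_hf_to_gguf.py", HF_MODEL_DIR,
     "--outfile", "model.f16.gguf"],
    check=True,
)

# 2. Produce differently quantized variants (e.g. Q8_0 vs Q5_K_M) whose
#    size/quality trade-offs can then be compared in Ollama or llama.cpp.
for qtype in ["Q8_0", "Q5_K_M"]:
    subprocess.run(
        [f"{LLAMA_CPP}/llama-quantize", "model.f16.gguf",
         f"model.{qtype}.gguf", qtype],
        check=True,
    )
```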

There is just some high-level information in the documentation, while the interface offers a lot of options with no explanation of what each one does. If you do not have strong knowledge of LLMs, the software is useless, and it is not as if you can easily find this kind of information elsewhere; searching for the specific terms on Google, or asking Bard or GPT-4, turns up basically nothing.

It is just too bad, because the app looks very nice and could become something great, but the lack of documentation is daunting. I would write it myself if I knew how to use the tool. At a minimum, the documentation should cover tutorials for training/fine-tuning a pre-trained model, exporting it for use with other tools like Ollama, and producing different quantization types to test.
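The side-by-side test described above could look roughly like the following, using the llama-cpp-python bindings; the file names continue from the earlier sketch and the prompt is a placeholder:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load each quantized variant and compare its output on the same prompt.
for path in ["model.Q8_0.gguf", "model.Q5_K_M.gguf"]:
    llm = Llama(model_path=path, n_ctx=2048, verbose=False)
    out = llm("Explain quantization in one sentence.", max_tokens=64)
    print(path, "->", out["choices"][0]["text"].strip())
```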

I understand that this is the whole idea: fine-tune a model and use the outputs in different scenarios. Better documentation would also make it easier for new developers to join and start contributing; what keeps me from being able to help in any way is the lack of documentation and of understanding of the concepts and how to apply them to get the desired result.


pascal-pfeiffer commented 8 months ago

Thank you for the request. I agree that the documentation (inside LLM Studio and externally) can be improved. Quantization may not be the most important part, as it is only a way to reduce memory consumption during training.
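To illustrate: training-time quantization typically means loading the backbone weights in lower precision, for example via bitsandbytes in transformers. A minimal sketch, assuming a recent transformers/bitsandbytes install; the model id is only an example:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the backbone in 4-bit so fine-tuning (e.g. with LoRA adapters on top)
# fits into less GPU memory; this setting does not change the format of the
# final exported model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2ogpt-4096-llama2-7b",  # example model id
    quantization_config=bnb_config,
    device_map="auto",
)
```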

Nowhere does it specify what kind of model I will have after fine-tuning and how to use it (the normal export is in one format and the one exported to Hugging Face is in another).

Actually, both should be identical. Clicking "Download model" gives you an HF-compatible model, bundled together with a model_card.md that contains demo code to run the model.
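The demo code in such a model_card.md is typically a plain transformers snippet along these lines (a generic sketch, not the exact card contents; the local path is a placeholder):

```python
from transformers import pipeline

# Load the downloaded export like any other Hugging Face model.
generate = pipeline(
    "text-generation",
    model="./my-finetuned-model",  # path to the unpacked "Download model" bundle
    device_map="auto",
)
print(generate("Why is the sky blue?", max_new_tokens=64)[0]["generated_text"])
```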

What do I do with the result? How can I get different quantization types as output so that I can run them with Ollama and compare them to understand which fits my needs best?

That would indeed be a great addition to the external docs: how to run the model on common test benches or eval studios. But as all exported models are HF models per se, it wouldn't be any different from any other model out there.
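As one concrete example of such a test bench: the exported folder can be passed to the lm-evaluation-harness like any other HF model. A sketch assuming harness v0.4+ and a placeholder path/task:

```python
import lm_eval  # pip install lm-eval

# Evaluate the exported HF model on a standard benchmark task.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./my-finetuned-model",
    tasks=["hellaswag"],
)
print(results["results"])
```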