b4rtaz / distributed-llama

Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
MIT License

[Feature Suggest] Config File alternative to Command Line Arguments #71

Closed DifferentialityDevelopment closed 1 month ago

DifferentialityDevelopment commented 1 month ago

I have an idea to make the program a bit more user-friendly. Right now, we have to pass all the settings through command-line arguments, which can be a hassle. How about we add support for a configuration file that the program can use by default if no command-line arguments are provided? Plus, it would be great if we could specify which config file to use.

Default Config File: The program should check for a default config file (like config.json or settings.ini) in the program directory. If there are no command-line arguments, the program should automatically use the settings from this config file.

Custom Config File: Add a new command-line option (like --config) that lets users specify a different config file. If this option is used, the program should load settings from the specified config file instead of the default one.
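
A minimal sketch of the lookup order described above, written in Python for brevity. The file name `config.json`, the `--model` and `--nthreads` settings, and the rule that explicit flags override values from the file are all assumptions made for illustration, not part of the proposal or of the existing CLI.

```python
import argparse
import json
import os


def load_settings(argv, default_path="config.json"):
    """Resolve settings from a config file and/or command-line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config")              # proposed option: path to a config file
    parser.add_argument("--model")               # illustrative setting
    parser.add_argument("--nthreads", type=int)  # illustrative setting
    args = parser.parse_args(argv)

    if args.config:
        # Custom config file: use it instead of the default one.
        path = args.config
    elif not argv and os.path.exists(default_path):
        # No arguments at all: fall back to the default config file.
        path = default_path
    else:
        path = None

    file_settings = {}
    if path is not None:
        with open(path) as f:
            file_settings = json.load(f)

    # Assumption: flags given explicitly on the command line win over the file.
    cli_settings = {
        key: value
        for key, value in vars(args).items()
        if key != "config" and value is not None
    }
    return {**file_settings, **cli_settings}
```

Calling `load_settings([])` reads the default file if it exists, while `load_settings(["--config", "other.json"])` reads only `other.json`.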

With the addition of dllama-api, we now have the ability to easily parse JSON files. What do you think?

unclemusclez commented 1 month ago

I agree with this, but atm it doesn't seem like a huge priority if variables are going to keep changing or being added. A bash script might be the answer, or maybe a script to generate a start script.

b4rtaz commented 1 month ago

To be honest, I don't like configuration files. A configuration file distracts from a correctly designed CLI. Also, in Docker/K8s there is a shift towards environment variables and CLI arguments. So I think this project should not go in a different direction.

I understand that there are now too many arguments needed to run anything. I recently changed the converter to store the float type in the model, so --weights-float-type will soon be deprecated.

Additionally, the download-model.py generator creates a run_x.sh script that lets you run the model with a single command: bash run_x.sh.
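
For illustration only, a start script of this kind could be generated roughly as follows. This is a hypothetical sketch, not the actual download-model.py code; the function name `render_run_script`, the binary path, the `inference` mode and the model file name are placeholders.

```python
import shlex


def render_run_script(binary, args):
    """Return the text of a shell script that runs `binary` with `args`."""
    # Quote every part so that paths containing spaces survive the shell.
    command = " ".join(shlex.quote(part) for part in [binary, *args])
    return "#!/bin/bash\n" + command + "\n"


# Hypothetical values: binary name, mode and model file are placeholders.
script = render_run_script("./dllama", ["inference", "--model", "my model.m"])
print(script)
```

Writing the returned text to run_x.sh would give a file that can be started with bash run_x.sh, so the long argument list only has to be assembled once.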

Maybe this project should also have short argument names (-b --buffer-float-type). IMO this is a better direction.
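
A rough sketch of what a short alias looks like, using Python's argparse because it is compact. The -b alias comes from the suggestion above and does not exist yet; the value q80 is only an example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="dllama")
    # One option, two spellings: the short alias and the existing long name.
    parser.add_argument(
        "-b", "--buffer-float-type",
        help="float type used for buffers (example option)",
    )
    return parser


# Both spellings fill the same attribute on the parsed result.
short = build_parser().parse_args(["-b", "q80"])
long = build_parser().parse_args(["--buffer-float-type", "q80"])
print(short.buffer_float_type == long.buffer_float_type)
```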