instructlab / instructlab

InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.
https://instructlab.ai
Apache License 2.0

Add commands to list the effective (non)persistent storage paths to stdout #1977

Open booxter opened 1 month ago

booxter commented 1 month ago

Is your feature request related to a problem? Please describe.

With the Persisted Storage change, files are now stored in different places, and their paths are no longer relative to the current directory (.cache, .config, etc.). These paths are also platform-dependent (e.g. Mac users store their cached models under the Application Support directory; in the future, Windows users may have their own storage paths).
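To illustrate why these paths differ per platform, here is a minimal sketch of how per-user storage directories are commonly resolved. It is not InstructLab's actual implementation: the function name `default_storage_dirs`, the `cache`/`config`/`data` keys, and the `instructlab` subdirectory name are all assumptions made for illustration. It follows the XDG Base Directory defaults on Linux and the `~/Library` convention on macOS.

```python
from pathlib import PurePosixPath


def default_storage_dirs(platform, home, env, app="instructlab"):
    """Return illustrative cache/config/data directories (hypothetical helper).

    `platform` follows sys.platform naming ("linux", "darwin").
    `env` is a mapping of environment variables, e.g. os.environ.
    """
    home_path = PurePosixPath(home)

    if platform == "darwin":
        # macOS convention: per-user files live under ~/Library.
        library = home_path / "Library"
        return {
            "cache": str(library / "Caches" / app),
            "config": str(library / "Application Support" / app),
            "data": str(library / "Application Support" / app),
        }

    def xdg(var, fallback):
        # Use the XDG variable when set and non-empty, else the spec default.
        value = env.get(var)
        return PurePosixPath(value) if value else home_path / fallback

    # Linux and other Unix-like systems: XDG Base Directory defaults.
    return {
        "cache": str(xdg("XDG_CACHE_HOME", ".cache") / app),
        "config": str(xdg("XDG_CONFIG_HOME", ".config") / app),
        "data": str(xdg("XDG_DATA_HOME", ".local/share") / app),
    }
```

Calling `default_storage_dirs(sys.platform, os.path.expanduser("~"), os.environ)` would give the values for the current machine. Note that the macOS paths contain a space, which matters for any shell snippet that consumes them.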

Describe the solution you'd like

It would be nice if we could instruct users to e.g. list model files with:

$ find "$(ilab config dir downloads)"

or list trained models with:

$ ls "$(ilab config dir models)"

(The exact names of options for each persisted storage path would be subject to discussion. The examples above are just for illustration purposes.)
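As a rough sketch of what such a command could look like, the snippet below wires up a `config dir <name>` subcommand that prints a single path and nothing else, so the output is safe to use in command substitution. Everything here is hypothetical: the `ilab-sketch` program name, the `downloads`/`models` option names, and the example paths are placeholders, and `argparse` merely stands in for whatever CLI framework the project uses.

```python
import argparse

# Placeholder paths; a real command would take these from the resolved
# platform-specific storage configuration.
STORAGE_DIRS = {
    "downloads": "/home/user/.cache/instructlab/models",
    "models": "/home/user/.local/share/instructlab/checkpoints",
}


def build_parser():
    """Build a parser for the hypothetical `config dir <name>` command."""
    parser = argparse.ArgumentParser(prog="ilab-sketch")
    groups = parser.add_subparsers(dest="group", required=True)
    config = groups.add_parser("config")
    actions = config.add_subparsers(dest="action", required=True)
    dir_cmd = actions.add_parser("dir", help="print a storage directory")
    dir_cmd.add_argument("name", choices=sorted(STORAGE_DIRS))
    return parser


def resolve(argv):
    """Return the path selected by the given argument list."""
    args = build_parser().parse_args(argv)
    return STORAGE_DIRS[args.name]


def main(argv=None):
    # Print only the path, with no labels or decoration, so that
    # "$(ilab-sketch config dir models)" expands to exactly one path.
    print(resolve(argv))
    return 0
```

For example, `main(["config", "dir", "models"])` prints the placeholder models path. Keeping the output to a bare path is the design point: scripts never need to parse anything.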

Additional context

Some of the use cases above could be handled with dedicated commands, e.g. the recently added model list and data list. Still, I think it should be easy for users to retrieve the platform-specific paths for use in scripts and elsewhere.

booxter commented 1 month ago

There are suggestions to add the directories to the sysinfo output, which I think may be relevant here: https://github.com/instructlab/instructlab/issues/1858. With that issue fixed, it would at least be possible to extract the directories programmatically, but doing so would require parsing the output of sysinfo. That is probably not something we'd like to advertise, since later changes to the message format, or translations, could break the parsing. Mentioning it here for completeness.
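A small sketch of the fragility concern: the parser below extracts a directory from sysinfo-style text by matching a label. The output format and the `models dir:` label are invented for illustration, since the real sysinfo output may look nothing like this. The point is only that a harmless rewording of the label silently breaks the consumer.

```python
import re


def models_dir_from_sysinfo(text):
    """Extract the models directory from hypothetical sysinfo-style output.

    Returns None when the expected label is not found.
    """
    match = re.search(r"^models dir:\s*(.+)$", text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


# Invented sample output in the format the parser expects.
OLD_OUTPUT = (
    "python version: 3.11\n"
    "models dir: /home/user/.local/share/instructlab/models\n"
)

# The same information after a hypothetical rewording of the label.
NEW_OUTPUT = (
    "python version: 3.11\n"
    "models directory: /home/user/.local/share/instructlab/models\n"
)
```

Here `models_dir_from_sysinfo(OLD_OUTPUT)` returns the path, while `models_dir_from_sysinfo(NEW_OUTPUT)` returns `None`, even though nothing about the underlying storage changed.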

nathan-weinberg commented 1 month ago

I agree!