Open jjasghar opened 1 month ago
Ah it seems after 26 mins this appeared:
INFO 2024-08-08 16:26:38,293 instructlab.sdg:411: Generated 1 samples
INFO 2024-08-08 16:26:38,293 instructlab.sdg.pipeline:153: Running pipeline single-threaded
INFO 2024-08-08 16:26:38,294 instructlab.sdg.pipeline:197: Running block: gen_mmlu_knowledge
INFO 2024-08-08 16:26:38,294 instructlab.sdg.pipeline:198: Dataset({
features: ['icl_document', 'document', 'document_outline', 'domain', 'icl_query_1', 'icl_query_2', 'icl_query_3', 'icl_response_1', 'icl_response_2', 'icl_response_3'],
num_rows: 25
})
There needs to be some feedback saying it's running so people don't "ctrl-c" out of it thinking it's broken when it first starts.
Could this actually be using just my CPU, per instructlab/instructlab#2028, and not my GPU at all? Even though I have run:
pip cache remove llama_cpp_python
pip install --force-reinstall llama_cpp_python==0.2.75 -C cmake.args="-DLLAMA_METAL=on"
to make sure my llama_cpp_python has Apple Metal enabled?
@jjasghar, you can use asitop (brew install asitop) to confirm your GPU usage on your Mac.
Train profiles don't have any bearing on SDG; they are separate components.
Well, it did finish, and per @bjhargrave's suggestion it looks like my GPU is being used.
INFO 2024-08-08 20:02:00,978 instructlab.sdg:438: Generation took 12617.11s
I would like to say it did take about 3.5 hours, and some "yep, I'm running" feedback would have been nice to have.
Something like a progress bar would be a good indicator - not sure if that change would be for the CLI or SDG lib (I assume the former)
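To make the idea concrete, here is a minimal sketch of a periodic "still running" message wrapped around a long blocking step. It uses only the Python standard library and is not taken from the instructlab CLI or SDG code; the heartbeat helper, its parameters, and the message format are all hypothetical, and a real change might use a proper progress bar instead.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def heartbeat(label, interval=30.0, emit=print):
    """Hypothetical helper: call emit() every `interval` seconds until the with-block exits."""
    start = time.monotonic()
    done = threading.Event()

    def _beat():
        # Event.wait returns False on timeout and True once `done` is set,
        # so this loop emits one message per interval and stops promptly.
        while not done.wait(interval):
            elapsed = time.monotonic() - start
            emit(f"{label}: still running ({elapsed:.0f}s elapsed)")

    worker = threading.Thread(target=_beat, daemon=True)
    worker.start()
    try:
        yield
    finally:
        # Stop the background thread and report the total time.
        done.set()
        worker.join()
        emit(f"{label}: finished in {time.monotonic() - start:.2f}s")
```

Wrapping a slow call as `with heartbeat("gen_mmlu_knowledge"): ...` would print a line every 30 seconds, which is enough to show the process has not hung even when the work itself cannot report percent complete.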
Using ilab -v data generate can provide more output from DEBUG logging, just to let you know things are happening.
Describe the bug
When I run ilab data generate there is no update or output like in 0.17.1. This is after 20+ mins on a Mac M3. Activity Monitor says "Python" is running, but I don't see anything.