lamini-ai / lamini

The Official Python Client for Lamini's API
https://lamini.ai/
Apache License 2.0
2.52k stars, 151 forks

Colab example training fails with no log at https://app.lamini.ai/train #24

Closed: yangcheng closed this issue 1 year ago

yangcheng commented 1 year ago

I tried to follow the Colab notebook linked in the README: https://colab.research.google.com/drive/1QMeGzR9FnhNJJFmcHtm9RhFP3vrwIkFn?usp=sharing

The training step

import time

start = time.time()
finetune_model.train(enable_peft=True)  # finetune_model is defined in an earlier notebook cell
print(f"Time taken: {time.time() - start} seconds")

always fails with:

Training job submitted! Check status of job 3459 here: https://app.lamini.ai/train
Job failed: {'job_id': 3459, 'status': 'FAILED', 'start_time': '2023-09-27T12:52:33.304092', 'model_name': None, 'custom_model_name': None, 'is_public': None}
Time taken: 35.00070023536682 seconds

I tried specifying different model names, but it did not help. What makes this harder to debug is that the log tab on https://app.lamini.ai/train is also empty.

How should I proceed from here? Any suggestions are greatly appreciated!
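
For anyone reporting the same failure, here is a minimal sketch of condensing a status payload like the one in the "Job failed" message above into one line that can be pasted into an issue. The helper name `summarize_job` is hypothetical and not part of the lamini client; the only assumption is that the payload is a plain dict with the keys shown in the log.

```python
from datetime import datetime


def summarize_job(job: dict) -> str:
    """Build a one-line summary of a job status dict for a bug report.

    Hypothetical helper, not part of the lamini client. Keys mirror the
    payload printed in the failure message; missing keys fall back to None.
    """
    started = job.get("start_time")
    if started is not None:
        # start_time in the log is an ISO 8601 string,
        # e.g. '2023-09-27T12:52:33.304092'; drop the microseconds.
        started = datetime.fromisoformat(started).strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"job {job.get('job_id')}: {job.get('status')} "
        f"(started {started}, model {job.get('model_name')})"
    )


# Payload copied from the failure message above.
failed_job = {
    "job_id": 3459,
    "status": "FAILED",
    "start_time": "2023-09-27T12:52:33.304092",
    "model_name": None,
    "custom_model_name": None,
    "is_public": None,
}

print(summarize_job(failed_job))
```

In this payload `model_name` is `None` even though a model name was specified, which may be worth mentioning in a report alongside the job ID.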

edamamez commented 1 year ago

Hello!! Checking this out now 👀

Thank you for bringing this to our attention!

edamamez commented 1 year ago

Fixed!! Please try again 🙏 The logs should appear as well.

(We recently rotated some keys and missed a spot 😅 )

yangcheng commented 1 year ago

Just tried again. The job is now in the queued state, fingers crossed.

yangcheng commented 1 year ago

My last run finished successfully, with eval results and logs in the dashboard. Thanks for the quick turnaround!

GZDXGeorge commented 1 year ago

I have also run into this problem. Please help. What should I do?