argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable, and scalable pipelines based on verified research papers.
https://distilabel.argilla.io
Apache License 2.0

[DOCS] tutorial on using Notus and Llama2 to generate finance preference dataset #325

Closed: kcentric closed this issue 1 month ago

kcentric commented 7 months ago

Create a notebook showing an end-to-end workflow with distilabel to create a preference dataset based on a ~200-page economic document (the IMF World Economic Outlook, April 2023). The preference dataset could then be used to fine-tune a model to be an expert on the document.

https://colab.research.google.com/drive/1iPjuHhvMxe7LjDwZOzDJnXBWSpJ2ybhL?usp=sharing

Thought: instead of just creating a preference dataset and ending the tutorial, I could also go on to demonstrate fine-tuning Phi-2 on it.

I've added an intro with a good outline and created the first couple of code cells. Currently I'm facing an issue instantiating Notus via vLLM: "Bfloat16 is only supported on GPUs with compute capability of at least 8.0; your Tesla T4 has compute capability 7.5."

kcentric commented 7 months ago

OK, so I resolved the compute capability error by adding `dtype="float16"` when instantiating vLLM with distilabel.
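For anyone landing here, this is roughly what the fix looks like (a minimal sketch using vLLM directly; distilabel just wraps the resulting `LLM` object, and the wrapper's signature varies by version):

```python
# Minimal sketch: force float16 so Notus loads on a Tesla T4 (compute
# capability 7.5, which has no bfloat16 support). `dtype` is a regular
# parameter of vllm.LLM; the default "auto" follows the model config,
# which for this model selects bfloat16.
from vllm import LLM

llm = LLM(
    model="argilla/notus-7b-v1",  # Notus 7B on the Hugging Face Hub
    dtype="float16",              # T4s lack bfloat16 support; fall back to fp16
)
```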

Before this, I went through a long journey trying to install vLLM from source instead, etc. Looked at this, this, and this link.

Maybe we should add a clarification about this in our LLM docs: https://distilabel.argilla.io/latest/technical-reference/llms. I'll open a new issue proposing this; I believe anyone using vLLM with Google Colab would face this issue.

However, I am now facing another error: it appears that CUDA is running out of memory when loading vLLM. Currently exploring how to fix this.

Screenshot below:

[Screenshot: 2024-02-04 at 11:00:50 PM]
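Two knobs I plan to try first (a sketch; `gpu_memory_utilization` and `max_model_len` are both standard `vllm.LLM` parameters):

```python
# Sketch: reduce vLLM's memory footprint on a small GPU.
from vllm import LLM

llm = LLM(
    model="argilla/notus-7b-v1",
    dtype="float16",
    gpu_memory_utilization=0.80,  # cap the fraction of GPU memory vLLM pre-allocates (default 0.9)
    max_model_len=2048,           # shorter context window -> smaller KV cache
)
```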
davidberenstein1957 commented 7 months ago

@kcentric the general intro is looking quite good already. When you hit an OOM error, you might consider a monthly Google Colab Pro subscription, which allows you to select higher-memory GPUs.

kcentric commented 7 months ago

I'm getting a Pydantic V2 incompatibility error when loading Haystack: it claims that Haystack is using a Pydantic version lower than V2. I searched the Haystack docs and saw some issues mentioning it, but couldn't find a solution. I'm using the same Haystack import lines as @ignacioct, so I wonder if @ignacioct faced the same error, or how to resolve it?

The error happens when I run `from haystack.nodes import PDFToTextConverter, PreProcessor`.

Traceback starts here:

[Screenshot: 2024-02-05 at 11:16:16 PM]

Ends with this:

[Screenshot: 2024-02-05 at 11:18:00 PM]

I see that when the installation itself is running, I get this warning:

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
farm-haystack 1.24.0 requires pydantic<2, but you have pydantic 2.6.1 which is incompatible.
farm-haystack 1.24.0 requires transformers==4.36.2, but you have transformers 4.37.2 which is incompatible.
```

Strangely enough, moments later when loading vLLM, I get the opposite warning, which claims I have a lower Pydantic version than I should:

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vllm 0.3.0 requires pydantic>=2.0, but you have pydantic 1.10.14 which is incompatible.
vllm 0.3.0 requires transformers>=4.37.0, but you have transformers 4.36.2 which is incompatible.
```

Totally bamboozling me. I'll post some more code snippets or screenshots if you need me to.
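For reference, the two warnings together explain the flip-flop: `farm-haystack` pins `pydantic<2` while `vllm` needs `pydantic>=2`, so whichever package was installed last leaves the other complaining. A quick standard-library sketch to print the declared pins (package names taken from the warnings above):

```python
# Sketch: show the pydantic/transformers requirements each package declares,
# confirming that farm-haystack (<2) and vllm (>=2) can't coexist in one env.
from importlib.metadata import requires

for pkg in ("farm-haystack", "vllm"):
    pins = [r for r in (requires(pkg) or []) if r.startswith(("pydantic", "transformers"))]
    print(pkg, pins)
```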

kcentric commented 6 months ago

Update: the Pydantic incompatibility error has been resolved. I switched to the Haystack 2.0 beta, which drops the `pydantic<2` pin. I've also added a cell that lets anyone use Google Gemma if they don't have enough memory to run Notus, and Phi-2 in case they don't even have access to Gemma 🙂

instruction_model = "argilla/notus-7b"  # Replace with "mistralai/Mistral-7b-Instruct-v0.1" etc., if desired.
# instruction_model = "google/gemma-2b"  # Uncomment this line to use Gemma in case you are low on CUDA memory!
instruction_model = "microsoft/phi-2"  # Uncomment this line to use Phi-2 in case you aren't authenticated to use even Gemma!
                                        # Be ready for unexpected behaviours though, because Phi-2 isn't a chat-tuned model.
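For the Haystack side, the beta replaces the failing `haystack.nodes` imports with components, roughly like this (a sketch; component names follow the 2.0-beta docs and may shift between beta releases, and the PDF filename is a placeholder):

```python
# Sketch: Haystack 2.0-beta equivalents of PDFToTextConverter + PreProcessor.
from haystack.components.converters import PyPDFToDocument
from haystack.components.preprocessors import DocumentSplitter

converter = PyPDFToDocument()
docs = converter.run(sources=["weo_april_2023.pdf"])["documents"]  # hypothetical filename

# Split the ~200-page document into overlapping word-based chunks.
splitter = DocumentSplitter(split_by="word", split_length=350, split_overlap=50)
chunks = splitter.run(documents=docs)["documents"]
```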