OK, so I resolved the compute capability error by adding dtype="float16" when instantiating vLLM with distilabel.
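For reference, a minimal sketch of the fix (the model id is the one from my notebook; adapt as needed):

```python
from vllm import LLM

# Tesla T4s (compute capability 7.5) have no bfloat16 support, so force
# half precision instead of the checkpoint's default bf16 weights.
llm = LLM(model="argilla/notus-7b-v1", dtype="float16")
# This LLM instance is then handed to distilabel's vLLM wrapper.
```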
Before this, I went through a long journey trying to install vLLM from source instead, etc. I looked at this, this, and this link.
Maybe we should add a clarification about this in our LLM docs: https://distilabel.argilla.io/latest/technical-reference/llms. I'll open a new issue proposing this; I believe anyone using vLLM with Google Colab would face this issue.
However, I am now facing another error: it appears that CUDA is running out of memory when loading vLLM. Currently exploring how to fix this.
Screenshot below: [CUDA out-of-memory traceback omitted]
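The knobs I'm planning to try first; a sketch only, and the values are guesses for a 16 GB T4, not a verified fix:

```python
from vllm import LLM

llm = LLM(
    model="argilla/notus-7b-v1",
    dtype="float16",              # T4 lacks bf16 support anyway
    gpu_memory_utilization=0.90,  # fraction of VRAM vLLM is allowed to claim
    max_model_len=2048,           # smaller KV cache, at the cost of context length
)
```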
@kcentric the general intro is looking quite good already. When you run into an OOM error, you might want to get a monthly subscription to Google Colab Pro, which allows you to select higher-memory GPUs.
I'm getting a Pydantic V2 incompatibility error when loading Haystack: it complains that Haystack is built against a Pydantic version lower than V2. I searched the Haystack docs and saw some issues mentioning it, but couldn't find a solution. I'm using the same Haystack import lines as @ignacioct, so I wonder if @ignacioct faced the same error and, if so, how to resolve it?
The error happens when I run from haystack.nodes import PDFToTextConverter, PreProcessor.
[Traceback screenshots omitted]
I see that when the installation itself is running, I get this warning: "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. farm-haystack 1.24.0 requires pydantic<2, but you have pydantic 2.6.1 which is incompatible. farm-haystack 1.24.0 requires transformers==4.36.2, but you have transformers 4.37.2 which is incompatible."
Strangely enough, I get the opposite warning moments later when loading vLLM, which claims I have a lower Pydantic version than I should: "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. vllm 0.3.0 requires pydantic>=2, but you have pydantic 1.10.14 which is incompatible. vllm 0.3.0 requires transformers>=4.37.0, but you have transformers 4.36.2 which is incompatible."
Totally bamboozling me. I'll post some more code snippets or screenshots if you need me to.
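Looking at the two messages again, they're consistent if the installs ran in sequence; a sketch of the (assumed) order in the notebook's setup cells:

```python
# Assumed install order; each step re-pins pydantic/transformers, so
# whichever package installs last "wins" and the other one complains.
!pip install vllm            # pulls pydantic 2.6.1 and transformers 4.37.2
!pip install farm-haystack   # needs pydantic<2 and transformers==4.36.2,
                             # so pip downgrades both -> vllm's warning above
```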
Update: the Pydantic incompatibility error has been resolved. I switched to the Haystack beta, which works with Pydantic V2. I've also added a cell that allows anyone to use Google Gemma if they don't have enough memory to run Notus, and Phi-2 in case they don't even have access to Gemma 🙂
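Roughly what the switch looks like (a sketch; the package and component names are from the Haystack 2.x beta docs and may shift between beta releases):

```python
!pip install haystack-ai  # the Haystack 2.x beta package, compatible with Pydantic V2

# 2.x replacements for the 1.x PDFToTextConverter / PreProcessor imports
from haystack.components.converters import PyPDFToDocument
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
```

And the model-selection cell: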
instruction_model = "argilla/notus-7b" # Replace with "mistralai/Mistral-7b-Instruct-v0.1" etc., if desired.
# instruction_model = "google/gemma-2b" # Uncomment this line to use Gemma in case you are low on CUDA memory!
instruction_model = "microsoft/phi-2" # Uncomment this line to use Phi-2 in case you aren't authenticated to use even Gemma!
# Be ready for unexpected behaviours though, because Phi-2 isn't a chat-tuned model.
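Whichever checkpoint is selected then goes through the same vLLM instantiation as before (sketch; the prompt and sampling settings are placeholders):

```python
from vllm import LLM, SamplingParams

llm = LLM(model=instruction_model, dtype="float16")
outputs = llm.generate(
    ["Summarize the outlook for global growth in 2023."],  # placeholder prompt
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```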
Create a notebook showing an end-to-end workflow with distilabel to create a preference dataset based on a ~200-page economic document (the IMF World Economic Outlook, April 2023). The preference dataset could then be used to fine-tune a model to be an expert on the document.
https://colab.research.google.com/drive/1iPjuHhvMxe7LjDwZOzDJnXBWSpJ2ybhL?usp=sharing
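A rough outline of the planned flow (sketch only; the filename is hypothetical and the distilabel steps are left as comments rather than guessed API calls):

```python
from pypdf import PdfReader

# 1. Extract raw text from the ~200-page IMF report.
reader = PdfReader("weo_april_2023.pdf")  # hypothetical local filename
pages = [page.extract_text() for page in reader.pages]

# 2. Chunk the pages into passages, generate instructions per passage,
#    sample several model completions, and have a labeller LLM rate them
#    into chosen/rejected pairs (these are the distilabel pipeline steps).
# 3. Export the rated pairs as a preference dataset for fine-tuning.
```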
Thought: Instead of just creating a preference dataset and ending the tutorial, I could go ahead and demonstrate fine-tuning Phi-2 on the resulting dataset.
Have added an intro with a good outline and created the first couple of code cells. Currently facing an issue instantiating Notus via vLLM: "Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your Tesla T4 GPU has compute capability 7.5."