deepset-ai / haystack

:mag: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Bug in `FARMReader.distil_prediction_layer_from` #5512

Closed: anakin87 closed this issue 10 months ago

anakin87 commented 1 year ago

Discussed in https://github.com/deepset-ai/haystack/discussions/5508

Originally posted by **zh0613** August 4, 2023

When I run this code:

```python
# Haystack 1.x import path (not shown in the original post)
from haystack.nodes import FARMReader

teacher = FARMReader(model_name_or_path="my_model", use_gpu=True)
student = FARMReader(model_name_or_path="huawei-noah/TinyBERT_General_6L_768D", use_gpu=True)

student.distil_intermediate_layers_from(teacher, data_dir=".", train_filename="augmented_dataset.json", use_gpu=True)
student.distil_prediction_layer_from(teacher, data_dir="data/squad20", train_filename="dev-v2.0.json", use_gpu=True)

student.save(directory="my_distilled_model")
```

this error pops up:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
in ()
      7 
      8 student.distil_intermediate_layers_from(teacher, data_dir=".", train_filename="augmented_dataset.json", use_gpu=True)
----> 9 student.distil_prediction_layer_from(teacher, data_dir="data/squad20", train_filename="dev-v2.0.json", use_gpu=True)
     10 
     11 student.save(directory="my_distilled_model")

7 frames
/usr/local/lib/python3.10/dist-packages/haystack/modeling/data_handler/data_silo.py in (.0)
    784 with torch.inference_mode():
    785     batch_transposed = zip(*batch)  # transpose dimensions (from batch, features, ... to features, batch, ...)
--> 786     batch_transposed_list = [torch.stack(b) for b in batch_transposed]  # create tensors for each feature
    787     batch_dict = {
    788         key: tensor.to(self.device) for key, tensor in zip(tensor_names, batch_transposed_list)

RuntimeError: stack expects each tensor to be equal size, but got [5, 2] at entry 0 and [6, 2] at entry 2
```
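
For anyone trying to narrow this down: the final `RuntimeError` says that the tensors collected for one feature do not all have the same shape within a batch. The snippet below is only a rough, pure-Python sketch of that equal-shape check, so it runs without PyTorch or Haystack. The helper names `first_shape_mismatch` and `pad_rows` are hypothetical, and the shapes are made up to mirror the message above. The padding helper only illustrates how a `[5, 2]` sample could be brought to `[6, 2]`; it is an assumption for illustration, not the fix used in Haystack.

```python
from typing import List, Optional, Sequence, Tuple

Shape = Tuple[int, ...]


def first_shape_mismatch(shapes: Sequence[Shape]) -> Optional[Tuple[int, Shape, Shape]]:
    """Return (index, reference_shape, offending_shape) for the first entry
    whose shape differs from entry 0, or None if all shapes agree."""
    if not shapes:
        return None
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if shape != reference:
            return index, reference, shape
    return None


def pad_rows(rows: List[List[int]], target_len: int, pad_row: Sequence[int]) -> List[List[int]]:
    """Return a copy of rows extended to target_len by appending copies of pad_row.
    Rows that are already long enough are returned unchanged."""
    return rows + [list(pad_row) for _ in range(target_len - len(rows))]


# Hypothetical per-sample shapes mirroring the traceback:
# entries 0 and 1 agree, entry 2 has one extra row.
shapes = [(5, 2), (5, 2), (6, 2)]

mismatch = first_shape_mismatch(shapes)
if mismatch is not None:
    index, expected, got = mismatch
    print(f"entry {index} has shape {list(got)}, expected {list(expected)}")

# Illustration only: pad a 5-row sample to 6 rows with a hypothetical filler value.
sample = [[0, 0] for _ in range(5)]
padded = pad_rows(sample, target_len=6, pad_row=[-1, -1])
print(len(padded), len(padded[0]))
```

Running a check like this over the per-sample shapes of the affected feature (before they reach `torch.stack`) should show which samples in `dev-v2.0.json` produce the extra row.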
Timoeller commented 1 year ago

This might require substantial effort to fix. I would prefer to:

  1. remove distillation from our Tutorial 2 (@bilgeyucel @TuanaCelik, could you add this task to your sprint, please?)
  2. wait for contributors to fix it
  3. work on it after the Haystack 2.0 release.