JesseBrouw / FoMo-LoRA

MIT License

MVP of the dynamic lora allocation framework #3

Closed MJHamar closed 6 months ago

MJHamar commented 7 months ago

Have a look, let's have a discussion!

MJHamar commented 6 months ago

self.scheduler.step() is never called because, by default, PeftModel.get_base_model returns the backbone model instance (with the LoRA layers attached, but without the LoraModel that wraps it). As a result, DynaLoraModel.forward() is never invoked.
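To make the mechanism concrete, here is a minimal sketch that uses plain Python stand-ins instead of the real classes. `Backbone`, `TunerWrapper`, `PeftLikeModel`, and `PatchedPeftLikeModel` are all hypothetical names, and the sketch assumes the outer model routes `forward` through `get_base_model()`. It is not the project's code, only an illustration of why the scheduler step gets skipped.

```python
class Scheduler:
    """Hypothetical scheduler that only counts how often it was stepped."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


class Backbone:
    """Stand-in for the target model (hypothetical)."""

    def __call__(self, x):
        return x * 2


class TunerWrapper:
    """Stand-in for DynaLoraModel: steps the scheduler on every forward."""

    def __init__(self, model):
        self.model = model
        self.scheduler = Scheduler()

    def __call__(self, x):
        self.scheduler.step()
        return self.model(x)


class PeftLikeModel:
    """Stand-in for the outer PEFT wrapper (assumed behaviour, not the real API)."""

    def __init__(self, backbone):
        self.base_model = TunerWrapper(backbone)

    def get_base_model(self):
        # Returns the backbone directly, skipping the tuner wrapper.
        return self.base_model.model

    def forward(self, x):
        # Assumption: forward is routed through get_base_model().
        return self.get_base_model()(x)


class PatchedPeftLikeModel(PeftLikeModel):
    """Same wrapper, but get_base_model() keeps the tuner wrapper in the path."""

    def get_base_model(self):
        return self.base_model


default_model = PeftLikeModel(Backbone())
default_model.forward(3)
print(default_model.base_model.scheduler.steps)  # 0: the wrapper was bypassed

patched_model = PatchedPeftLikeModel(Backbone())
patched_model.forward(3)
print(patched_model.base_model.scheduler.steps)  # 1: the wrapper's forward ran
```

In both cases the output of the forward pass is the same; only the side effect in the wrapper differs, which is why the problem is easy to miss.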

We can fix this by overriding get_base_model in PeftModelWrapper, but doing so causes a range of other errors, because the target model's forward signature is then lost. I'm working on a fix, but may not finish it today.
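For reference, the signature loss can be reproduced with the standard library alone. The sketch below is illustrative only: `backbone_forward`, `generic_forward`, `wrapped_forward`, and `parameter_names` are hypothetical names, not functions from this repository. It shows that a wrapper declared as `(*args, **kwargs)` hides the original parameter names from `inspect.signature`, and that `functools.wraps` is one possible way to keep them visible. Whether that is the right fix here is an open question.

```python
import functools
import inspect


def backbone_forward(input_ids, attention_mask=None, labels=None):
    """Hypothetical stand-in for the target model's forward method."""
    return {"n_inputs": len(input_ids)}


def generic_forward(*args, **kwargs):
    """A wrapper forward that hides the backbone's parameter names."""
    return backbone_forward(*args, **kwargs)


@functools.wraps(backbone_forward)
def wrapped_forward(*args, **kwargs):
    """Same wrapper, but functools.wraps records the wrapped function,
    and inspect.signature follows that record by default."""
    return backbone_forward(*args, **kwargs)


def parameter_names(fn):
    """List the parameter names that introspection reports for fn."""
    return list(inspect.signature(fn).parameters)


print(parameter_names(generic_forward))  # ['args', 'kwargs']
print(parameter_names(wrapped_forward))  # ['input_ids', 'attention_mask', 'labels']
```

Any caller that decides which inputs to pass based on the parameter names of forward would see only `args` and `kwargs` in the first case.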

MJHamar commented 6 months ago

Working on the issue that the scheduler is never called.

MJHamar commented 6 months ago

Full error:

  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/src/train.py", line 191, in <module>
    main()
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/src/train.py", line 185, in main
    trainer.train()
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 1859, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 2165, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/accelerate/data_loader.py", line 452, in __iter__
    current_batch = next(dataloader_iter)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = self.dataset.__getitems__(possibly_batched_index)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 2865, in __getitems__
    batch = self.__getitem__(keys)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 2861, in __getitem__
    return self._getitem(key)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 2845, in _getitem
    pa_subtable = query_table(self._data, key, indices=self._indices)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/formatting/formatting.py", line 587, in query_table
    _check_valid_index_key(key, size)
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/formatting/formatting.py", line 537, in _check_valid_index_key
    _check_valid_index_key(int(max(key)), size=size)
  File "/Users/hamarmiklos/Desktop/Egyetem/uva/FoMo-LoRA/.venv/lib/python3.11/site-packages/datasets/formatting/formatting.py", line 527, in _check_valid_index_key
    raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
IndexError: Invalid key: 8431 is out of bounds for size 0