bghira / SimpleTuner

A general fine-tuning kit geared toward diffusion models.
GNU Affero General Public License v3.0

UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value #1147

Open playerzer0x opened 1 day ago

playerzer0x commented 1 day ago

I'm training Flux on multi-GPU RunPod instances and getting this UnboundLocalError during training start-up and again on every checkpoint save. Training continues, thankfully, but I'm sharing in case it needs to be fixed:

Keyword arguments {'safety_checker': None} are not expected by FluxPipeline and will be ignored.
Loading pipeline components...: 100% 5/5 [00:00<00:00, 25.56it/s]
Loading pipeline components...:  20% 1/5 [00:00<00:00,  5Exception in thread Thread-263 (batch_write_embeddings):dation prompts:   0% 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-264 (batch_write_embeddings):rompts:   3% 1/30 [00:20<09:55, 20.55s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value
                                                                 Exception in thread Thread-265 (batch_write_embeddings):rompts:   7% 2/30 [00:41<09:35, 20.55s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-266 (batch_write_embeddings):rompts:  10% 3/30 [01:01<09:16, 20.63s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-267 (batch_write_embeddings):rompts:  13% 4/30 [01:22<08:57, 20.67s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-268 (batch_write_embeddings):rompts:  17% 5/30 [01:43<08:37, 20.70s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value
                                                                 Exception in thread Thread-269 (batch_write_embeddings):rompts:  20% 6/30 [02:03<08:14, 20.59s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value
                                                                 Exception in thread Thread-270 (batch_write_embeddings):rompts:  23% 7/30 [02:24<07:54, 20.65s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-271 (batch_write_embeddings):rompts:  27% 8/30 [02:45<07:35, 20.71s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                 Exception in thread Thread-272 (batch_write_embeddings):rompts:  30% 9/30 [03:05<07:14, 20.70s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-273 (batch_write_embeddings):ompts:  33% 10/30 [03:26<06:55, 20.77s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-274 (batch_write_embeddings):ompts:  37% 11/30 [03:47<06:35, 20.79s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-275 (batch_write_embeddings):ompts:  40% 12/30 [04:08<06:14, 20.83s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-276 (batch_write_embeddings):ompts:  43% 13/30 [04:29<05:54, 20.83s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-277 (batch_write_embeddings):ompts:  47% 14/30 [04:50<05:32, 20.81s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-278 (batch_write_embeddings):ompts:  50% 15/30 [05:10<05:11, 20.75s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-279 (batch_write_embeddings):ompts:  53% 16/30 [05:31<04:50, 20.76s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-280 (batch_write_embeddings):ompts:  57% 17/30 [05:52<04:30, 20.79s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-281 (batch_write_embeddings):ompts:  60% 18/30 [06:13<04:09, 20.77s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-282 (batch_write_embeddings):ompts:  63% 19/30 [06:33<03:48, 20.73s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-283 (batch_write_embeddings):ompts:  67% 20/30 [06:54<03:27, 20.72s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-284 (batch_write_embeddings):ompts:  70% 21/30 [07:15<03:06, 20.69s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-285 (batch_write_embeddings):ompts:  73% 22/30 [07:35<02:45, 20.71s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-286 (batch_write_embeddings):ompts:  77% 23/30 [07:56<02:24, 20.70s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-287 (batch_write_embeddings):ompts:  80% 24/30 [08:17<02:04, 20.71s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-288 (batch_write_embeddings):ompts:  83% 25/30 [08:38<01:43, 20.70s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-289 (batch_write_embeddings):ompts:  87% 26/30 [08:58<01:22, 20.61s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-290 (batch_write_embeddings):ompts:  90% 27/30 [09:18<01:01, 20.58s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-291 (batch_write_embeddings):ompts:  93% 28/30 [09:39<00:41, 20.61s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value   
                                                                  Exception in thread Thread-292 (batch_write_embeddings):ompts:  97% 29/30 [10:00<00:20, 20.61s/it]
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 222, in batch_write_embeddings
    first_item = self.write_queue.get(timeout=1)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 179, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 238, in batch_write_embeddings
    if len(batch) > 0:
           ^^^^^
UnboundLocalError: cannot access local variable 'batch' where it is not associated with a value
bghira commented 1 day ago

your system might be haunted. the respective code makes more sense than the error. let me show:

    def batch_write_embeddings(self):
        """Process write requests in batches."""
        batch = []

we're initialising batch here to an empty list.

        written_elements = 0
        while True:
            try:
                # Block until an item is available or timeout occurs
                first_item = self.write_queue.get(timeout=1)
                batch = [first_item]

and here it isn't deleted either; it's rebound to a one-item list holding the first item.

                # Try to get more items without blocking
                while (
                    not self.write_queue.empty() and len(batch) < self.write_batch_size
                ):

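this loop condition is one place where batch gets read ^ (your traceback actually points at the "if len(batch) > 0:" check in the except queue.Empty handler further down)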

but batch is an empty list or populated with one item.

maybe later in the loop it deletes batch:

                    logger.debug("Retrieving more items from the queue.")
                    items = self.write_queue.get_nowait()
                    batch.append(items)
                    logger.debug(f"Batch now contains {len(batch)} items.")

                self.process_write_batch(batch)
                self.write_thread_bar.update(len(batch))
                logger.debug("Processed batch write.")
                written_elements += len(batch)

            except queue.Empty:
                # Timeout occurred, no items were ready
                if not self.process_write_batches:
                    if len(batch) > 0:
                        self.process_write_batch(batch)
                        self.write_thread_bar.update(len(batch))
                    logger.debug(
                        f"Exiting batch write thread, no more work to do after writing {written_elements} elements"
                    )
                    break
                logger.debug(
                    f"Queue is empty. Retrieving new entries. Should retrieve? {self.process_write_batches}"
                )
                pass
            except Exception:
                logger.exception("An error occurred while writing embeddings to disk.")
        logger.debug("Exiting background batch write thread.")

nope. no deletions of the batch variable.

it might be a bad RunPod instance?
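
For reference, Python raises this exact error when a local name is only assigned inside a `try` body and an earlier statement in that body raises first. The sketch below is a minimal standalone reproduction. It assumes, and the thread does not confirm this, that the installed copy of `text_embeds.py` lacks the `batch = []` line before the loop. `drain_once` is a hypothetical stand-in, not SimpleTuner code.

```python
import queue


def drain_once(write_queue, initialise_batch):
    """Hypothetical stand-in for one pass of a batch-writer loop."""
    if initialise_batch:
        batch = []  # mirrors the `batch = []` line in the listing above
    try:
        # Raises queue.Empty on an empty queue, before `batch` is assigned below.
        first_item = write_queue.get(timeout=0.01)
        batch = [first_item]
    except queue.Empty:
        # Without the initialiser, `batch` is still unbound at this point.
        return len(batch)
    return len(batch)


empty_queue = queue.Queue()

# With the initialiser, the timeout path sees an empty list.
with_init = drain_once(empty_queue, initialise_batch=True)

# Without it, the same path fails with UnboundLocalError.
try:
    drain_once(empty_queue, initialise_batch=False)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__
```

If that is what is happening, comparing the installed file around lines 222 to 238 against the listing above should show whether the initialiser is present.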

playerzer0x commented 1 day ago

> it might be a bad RunPod instance?

Maybe? I'm sometimes re-using a volume across GPU instances in the same region. The training script eventually moves past these errors and starts training, but the errors come up again every time a checkpoint saves (which takes several minutes). I'll see if I get the same error without using the volume.

playerzer0x commented 12 hours ago

Tried running an identical workload on AWS. Start-up gets halted here:

2024-11-13 16:10:54,377 [ERROR] Invalidating cache: error loading all_text_cache_files_text-embed-cache from disk. Expecting value: line 1 column 1 (char 0)
2024-11-13 16:10:54,377 [ERROR] Invalidating cache: error loading all_text_cache_files_text-embed-cache from disk. Expecting value: line 1 column 1 (char 0)
2024-11-13 16:10:54,380 [INFO] Pre-computing null embedding     
Write embeds to disk:   0%|                                     
                                       | 0/1 [00:00<?, ?it/s]   

Processing prompts:   0%|                                       
Processing prompts:   0%|                                       
                                       | 0/1 [00:17<?, ?it/s]   
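
The `Expecting value: line 1 column 1 (char 0)` text in the log above is the message Python's `json` module produces when it is asked to parse an empty string, which is what reading a zero-byte file amounts to. A minimal sketch, assuming the cache index is read with the standard `json` module (the thread does not confirm this):

```python
import json

try:
    json.loads("")  # what parsing a zero-byte index file would amount to
    message = None
except json.JSONDecodeError as exc:
    message = str(exc)

print(message)
```

An empty or partially written index file on disk would therefore be one plausible source of the "Invalidating cache" lines.
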
bghira commented 11 hours ago

are you copying folders into the instances from another machine to run without having to do the caching again?

playerzer0x commented 11 hours ago

I'm cloning a fresh dataset repo from HuggingFace each time so it should be caching upon startup. Not resuming from an existing checkpoint either.

playerzer0x commented 7 hours ago

Same issue on a fresh Runpod instance in a different region/data center, with a similar workflow (this time I'm resuming a training run). It managed to process the first couple of datasets, but froze midway through:

2024-11-13 21:10:33,020 [INFO] (id=dct_desert_rally_racing_background-512) Collecting captions.
2024-11-13 21:10:33,069 [INFO] (id=dct_desert_rally_racing_background-512) Initialise text embed pre-computation using the textfile caption strategy. We have 25 captions to process.
2024-11-13 21:10:33,384 [INFO] (id=dct_desert_rally_racing_background-512) Completed processing 25 captions.                 
2024-11-13 21:10:33,384 [INFO] (id=dct_desert_rally_racing_background-512) Creating VAE latent cache.                        

Write embeds to disk:  33%|██████████████████████▋                                             | 1/3 [00:00<00:00,  4.52it/s]

Write embeds to disk:  33%|██████████████████████▋                                             | 1/3 [00:00<00:00,  3.87it/s]

Processing prompts:   0%|                                                                              | 0/3 [00:00<?, ?it/s]
Write embeds to disk:   0%|                                                                            | 0/4 [00:00<?, ?it/s]
Processing prompts:   0%|                                                                              | 0/3 [00:00<?, ?it/s]

Processing prompts:   0%|                                                                              | 0/4 [00:00<?, ?it/s]

Processing prompts:   0%|                                                                              | 0/3 [00:00<?, ?it/s]
Processing prompts:   0%|                                                                              | 0/3 [00:00<?, ?it/s]

Here's my config.json in case it's helpful:

{
  "--ignore_missing_files": "true",
  "--vae_cache_ondemand": "true",
  "--lycoris_config": "config/lycoris_config.json",
  "--resume_from_checkpoint": "latest",
  "--data_backend_config": "config/multidatabackend.json",
  "--aspect_bucket_rounding": 2,
  "--seed": 42,
  "--minimum_image_size": 0,
  "--disable_benchmark": false,
  "--output_dir": "ghxdct_style_focus_20241113_125207",
  "--lora_type": "lycoris",
  "--max_train_steps": 10000,
  "--num_train_epochs": 0,
  "--checkpointing_steps": 500,
  "--checkpoints_total_limit": 10,
  "--tracker_project_name": "ghxdct_style_focus",
  "--tracker_run_name": "ghxdct_style_focus_20241113_125207",
  "--report_to": "wandb",
  "--model_type": "lora",
  "--pretrained_model_name_or_path": "black-forest-labs/FLUX.1-dev",
  "--model_family": "flux",
  "--train_batch_size": 1,
  "--gradient_checkpointing": "true",
  "--caption_dropout_probability": 0.05,
  "--resolution_type": "pixel_area",
  "--resolution": 1024,
  "--validation_seed": 69,
  "--validation_steps": "500",
  "--validation_resolution": "1024x1024",
  "--validation_guidance": "3.5",
  "--validation_guidance_rescale": "0.0",
  "--validation_num_inference_steps": "20",
  "--validation_prompt": "a photo of a daisy",
  "--mixed_precision": "bf16",
  "--optimizer": "optimi-stableadamw",
  "--optimizer_config": "weight_decay=1e-3",
  "--learning_rate": "5e-06",
  "--flux_lora_target": "all+ffs",
  "--lr_scheduler": "polynomial",
  "--lr_warmup_steps": 100,
  "--user_prompt_library": "config/user_prompt_library.json",
  "--hub_model_id": "growwithdaisy/ghxdct_style_focus_20241113_125207",
  "--push_to_hub": "true",
  "--push_checkpoints_to_hub": "true",
  "--init_lora": "output/ghxdct_20241106_121319/pytorch_lora_weights.safetensors"
} 

And multidatabackend:

[
  {
    "id": "gh_logo-512",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/gh_logo",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 512,
    "target_downsample_size": 512,
    "resolution": 512,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//gh_logo-vae-512"
  },
  {
    "id": "gh_cans-512",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/gh_cans",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 512,
    "target_downsample_size": 512,
    "resolution": 512,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//gh_cans-vae-512"
  },
  {
    "id": "gh_cans-768",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/gh_cans",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 768,
    "target_downsample_size": 768,
    "resolution": 768,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//gh_cans-vae-768"
  },
  {
    "id": "gh_cans-1024",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/gh_cans",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 1024,
    "target_downsample_size": 1024,
    "resolution": 1024,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//gh_cans-vae-1024"
  },
  {
    "id": "dct_desert_rally_racing_background-512",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/dct_desert_rally_racing_background",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 512,
    "target_downsample_size": 512,
    "resolution": 512,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//dct_desert_rally_racing_background-vae-512"
  },
  {
    "id": "dct_desert_rally_racing_background-768",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/dct_desert_rally_racing_background",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 768,
    "target_downsample_size": 768,
    "resolution": 768,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//dct_desert_rally_racing_background-vae-768"
  },
  {
    "id": "dct_desert_rally_racing_background-1024",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/dct_desert_rally_racing_background",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 1024,
    "target_downsample_size": 1024,
    "resolution": 1024,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//dct_desert_rally_racing_background-vae-1024"
  },
  {
    "id": "anytylrjy_woman-512",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/anytylrjy_woman",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 512,
    "target_downsample_size": 512,
    "resolution": 512,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//anytylrjy_woman-vae-512"
  },
  {
    "id": "anytylrjy_woman-768",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/anytylrjy_woman",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 768,
    "target_downsample_size": 768,
    "resolution": 768,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//anytylrjy_woman-vae-768"
  },
  {
    "id": "anytylrjy_woman-1024",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/anytylrjy_woman",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 1024,
    "target_downsample_size": 1024,
    "resolution": 1024,
    "resolution_type": "pixel_area",
    "repeats": 0,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//anytylrjy_woman-vae-1024"
  },
  {
    "id": "mrtnprr_style-512",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/mrtnprr_style",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 512,
    "target_downsample_size": 512,
    "resolution": 512,
    "resolution_type": "pixel_area",
    "repeats": 1,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//mrtnprr_style-vae-512"
  },
  {
    "id": "mrtnprr_style-768",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/mrtnprr_style",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 768,
    "target_downsample_size": 768,
    "resolution": 768,
    "resolution_type": "pixel_area",
    "repeats": 1,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//mrtnprr_style-vae-768"
  },
  {
    "id": "mrtnprr_style-1024",
    "type": "local",
    "instance_data_dir": "datasets/ghxdct_style_focus/mrtnprr_style",
    "crop": false,
    "crop_style": "random",
    "minimum_image_size": 1024,
    "target_downsample_size": 1024,
    "resolution": 1024,
    "resolution_type": "pixel_area",
    "repeats": 1,
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "cache//mrtnprr_style-vae-1024"
  },
  {
    "id": "text-embed-cache",
    "dataset_type": "text_embeds",
    "default": true,
    "type": "local",
    "cache_dir": "cache//text",
    "disabled": false,
    "write_batch_size": 1
  }
]

Datasets are very small: at most 25 images each, all PNG.
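
The image entries in the config above differ only in dataset name, resolution, and `repeats`, so a short script could generate them. This is a sketch, not a SimpleTuner feature: `make_entry` and `build_config` are hypothetical helpers, and the field values (including the `cache//` prefix) are copied from the config as posted.

```python
import json


def make_entry(name, resolution, repeats=0):
    """Build one image-dataset entry matching the config posted above."""
    return {
        "id": f"{name}-{resolution}",
        "type": "local",
        "instance_data_dir": f"datasets/ghxdct_style_focus/{name}",
        "crop": False,
        "crop_style": "random",
        "minimum_image_size": resolution,
        "target_downsample_size": resolution,
        "resolution": resolution,
        "resolution_type": "pixel_area",
        "repeats": repeats,
        "metadata_backend": "discovery",
        "caption_strategy": "textfile",
        "cache_dir_vae": f"cache//{name}-vae-{resolution}",
    }


def build_config(datasets):
    """datasets: list of (name, resolutions, repeats) tuples."""
    entries = [
        make_entry(name, res, repeats)
        for name, resolutions, repeats in datasets
        for res in resolutions
    ]
    # The text embed cache entry is copied verbatim from the posted config.
    entries.append({
        "id": "text-embed-cache",
        "dataset_type": "text_embeds",
        "default": True,
        "type": "local",
        "cache_dir": "cache//text",
        "disabled": False,
        "write_batch_size": 1,
    })
    return entries


config = build_config([
    ("gh_logo", [512], 0),
    ("gh_cans", [512, 768, 1024], 0),
    ("dct_desert_rally_racing_background", [512, 768, 1024], 0),
    ("anytylrjy_woman", [512, 768, 1024], 0),
    ("mrtnprr_style", [512, 768, 1024], 1),
])
print(json.dumps(config, indent=2))
```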

playerzer0x commented 6 hours ago

Switched to main (was formerly on release), and now training successfully proceeds. Could be good to update Flux quick start instructions if main is the correct branch to use.

playerzer0x commented 6 hours ago

Training above failed after 15 steps:


Epoch 1/113, Steps:   0%|                      | 15/10000 [00:52<10:02:57,  3.62s/it, lr=7.5e-7, mean_cfg=1, step_loss=0.521]
2024-11-13 21:30:05,136 [ERROR] Failed to load corrupt torch file '/workspace/SimpleTuner/cache/text/10f34c853ae35b374b489315e55183a4-flux.pt': PytorchStreamReader failed reading zip archive: failed finding central directory
2024-11-13 21:30:05,138 [ERROR] Failed retrieving prompt from cache:
-> prompt: dct desert rally racing background, driving a Ducati bike, riding through water
-> filename: /workspace/SimpleTuner/cache/text/10f34c853ae35b374b489315e55183a4-flux.pt
-> error: PytorchStreamReader failed reading zip archive: failed finding central directory
-> id: text-embed-cache, data_backend id: text-embed-cache
Cache retrieval for text embed file failed. Ensure your dataloader config value for skip_file_discovery does not contain 'text', and that preserve_data_backend_cache is disabled or unset.
Traceback (most recent call last):
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 1110, in compute_embeddings_for_flux_prompts
    _flux_embed = self.load_from_cache(filename)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 277, in load_from_cache
    result = self.data_backend.torch_load(filename)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/data_backend/local.py", line 207, in torch_load
    raise e
  File "/workspace/SimpleTuner/helpers/data_backend/local.py", line 202, in torch_load
    loaded_tensor = torch.load(stored_tensor, map_location="cpu")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1072, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/.venv/lib/python3.11/site-packages/torch/serialization.py", line 480, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/SimpleTuner/train.py", line 49, in <module>
    trainer.train()
  File "/workspace/SimpleTuner/helpers/training/trainer.py", line 2136, in train
    batch = iterator_fn(step, *iterator_args)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/data_backend/factory.py", line 1377, in random_dataloader_iterator
    return next(chosen_iter)
           ^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 673, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/.venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/data_backend/factory.py", line 977, in <lambda>
    collate_fn=lambda examples: collate_fn(examples),
                                ^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/training/collate.py", line 534, in collate_fn
    compute_prompt_embeddings(captions, text_embed_cache)
  File "/workspace/SimpleTuner/helpers/training/collate.py", line 290, in compute_prompt_embeddings
    embeddings = list(
                 ^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/SimpleTuner/helpers/training/collate.py", line 241, in compute_single_embedding
    text_embed_cache.compute_embeddings_for_flux_prompts(prompts=[caption])
  File "/workspace/SimpleTuner/helpers/caching/text_embeds.py", line 1134, in compute_embeddings_for_flux_prompts
    raise Exception(
Exception: Cache retrieval for text embed file failed. Ensure your dataloader config value for skip_file_discovery does not contain 'text', and that preserve_data_backend_cache is disabled or unset.
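
Recent PyTorch releases have `torch.save` write a zip archive by default, and "failed finding central directory" commonly indicates an empty or truncated file. The sketch below is one way to list such files in the text embed cache before training, using only the standard library. `find_unreadable_cache_files` is a hypothetical helper, and the `*-flux.pt` pattern is taken from the filename in the log above.

```python
import zipfile
from pathlib import Path


def find_unreadable_cache_files(cache_dir, pattern="*-flux.pt"):
    """Return cache files that lack a readable zip end-of-archive record."""
    bad = []
    for path in sorted(Path(cache_dir).glob(pattern)):
        # is_zipfile() looks for the end-of-central-directory record,
        # which is missing from empty or truncated archives.
        if not zipfile.is_zipfile(path):
            bad.append(path)
    return bad
```

Calling `find_unreadable_cache_files("cache/text")` from the SimpleTuner directory would return the suspect paths. Passing this check does not guarantee that `torch.load` will succeed; it only catches the truncated-archive case seen here.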