Nerogar / OneTrainer

OneTrainer is a one-stop solution for all your stable diffusion training needs.
GNU Affero General Public License v3.0

[Bug] Training fails when Masked Training is enabled (masks were generated automatically); it runs normally when Masked Training is disabled. #322

Closed. Zanedname closed this issue 3 months ago

Zanedname commented 3 months ago

What happened?

Even though masks have been generated, enabling Masked Training at training time causes an error, so Masked Training cannot be used for now.

What did you expect would happen?

I expected training to run with Masked Training enabled. Currently I can't train while it is on.

Relevant log output

Traceback (most recent call last):
  File "D:\BOneTrainer\OneTrainer\modules\ui\TrainUI.py", line 518, in __training_thread_function
    trainer.train()
  File "D:\BOneTrainer\OneTrainer\modules\trainer\GenericTrainer.py", line 534, in train
    for epoch_step, batch in enumerate(step_tqdm):
  File "D:\BOneTrainer\OneTrainer\venv\lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "D:\BOneTrainer\OneTrainer\venv\lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
    data = self._next_data()
  File "D:\BOneTrainer\OneTrainer\venv\lib\site-packages\torch\utils\data\dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "D:\BOneTrainer\OneTrainer\venv\lib\site-packages\torch\utils\data\_utils\fetch.py", line 32, in fetch
    data.append(next(self.dataset_iter))
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\LoadingPipeline.py", line 120, in __next__
    item = self.__output_module.get_next_item()
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\OutputPipelineModule.py", line 40, in get_next_item
    item[output_name] = self._get_previous_item(self.current_variation, input_name, self.current_index)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\PipelineModule.py", line 100, in _get_previous_item
    item = module.get_item(index, item_name)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\pipelineModules\AspectBatchSorting.py", line 90, in get_item
    item[name] = self._get_previous_item(self.current_variation, name, index)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\PipelineModule.py", line 96, in _get_previous_item
    item = module.get_item(variation, index, item_name)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\pipelineModules\SampleVAEDistribution.py", line 25, in get_item
    distribution = self._get_previous_item(variation, self.in_name, index)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\PipelineModule.py", line 124, in _get_previous_item
    item = module.get_item(index, item_name)
  File "d:\bonetrainer\onetrainer\venv\src\mgds\src\mgds\pipelineModules\DiskCache.py", line 246, in get_item
    item[name] = split_item[name]
KeyError: 'latent_mask'

Output of pip freeze

No response

Nerogar commented 3 months ago

You need to clear the cache when you turn masked training on.

Zanedname commented 3 months ago

So the cache needs to be cleared when masked training is turned on.

OK, I'll try that.
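
For anyone hitting the same `KeyError: 'latent_mask'`: the traceback ends in `DiskCache.get_item`, which suggests the cached entries were written while masked training was off and so contain no mask. The sketch below is only a toy illustration of that failure mode and of clearing a cache folder by hand. It does not use the real mgds or OneTrainer API. `lookup`, `clear_cache_dir`, the key `latent_image_distribution`, and the file name `old_entry.pt` are all hypothetical, and the actual cache location depends on your own configuration.

```python
import shutil
import tempfile
from pathlib import Path


def lookup(cached_item: dict, names: list[str]) -> dict:
    """Copy the requested names out of a cached item.

    Same pattern as the last traceback frame: item[name] = split_item[name].
    """
    return {name: cached_item[name] for name in names}


def clear_cache_dir(cache_dir: Path) -> bool:
    """Delete a cache folder and recreate it empty.

    Returns True if the folder existed before the call.
    """
    existed = cache_dir.exists()
    if existed:
        shutil.rmtree(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    return existed


# Hypothetical entry cached while masked training was OFF: no mask stored.
stale_item = {"latent_image_distribution": "..."}

# With masked training ON, the pipeline also asks for 'latent_mask'.
try:
    lookup(stale_item, ["latent_image_distribution", "latent_mask"])
    stale_lookup_failed = False
    missing_key = None
except KeyError as err:
    stale_lookup_failed = True
    missing_key = err.args[0]

# Clearing the folder forces every entry to be rebuilt on the next run.
with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "cache"
    cache_dir.mkdir()
    (cache_dir / "old_entry.pt").write_bytes(b"stale")
    removed = clear_cache_dir(cache_dir)
    remaining = list(cache_dir.iterdir())
```

After the cache is cleared, the next training run has to rebuild every cached entry, which takes longer on the first epoch but should include the mask data when masked training is enabled.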