ChristianBergler / ANIMAL-SPOT

An Animal Independent Deep Learning Framework for Bioacoustic Signal Segmentation and Classification Including a Detailed User-Guide
GNU General Public License v3.0

Training the algo: "cannot pickle '_thread.lock' object" #10

Closed — liofeu closed this issue 3 months ago

liofeu commented 3 months ago

Hello,

I'm testing the training with the example dataset and I get the following error: TypeError: cannot pickle '_thread.lock' object. It occurs on a laptop running macOS 10.15.7 / Python 3.11.9, but not on a PC running Windows / Python 3.12.

Note: I'm testing the code without GPU for the moment.

Here are the terminal outputs:

Lionels-MacBook-Pro:TRAINING lionel$ python3 start_training.py config
2024-06-11 18:12:40,105 - training animal-spot - INFO - Config Data: {'src_dir': '"/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"', 'debug': '', 'data_dir': '"/Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET"', 'cache_dir': '"/Users/lionel/database/cache_dir"', 'model_dir': '"/Users/lionel/database/model_dir"', 'checkpoint_dir': '"/Users/lionel/database/checkpoint_dir"', 'log_dir': '"/Users/lionel/database/log_dir"', 'summary_dir': '"/Users/lionel/database/summary_dir"', 'noise_dir': '"/Users/lionel/database/noise_dir"', 'start_from_scratch': '', 'max_train_epochs': '1', 'epochs_per_eval': '2', 'batch_size': '16', 'num_workers': '8', 'lr': '10e-5', 'beta1': '0.5', 'lr_patience_epochs': '8', 'lr_decay_factor': '0.5', 'early_stopping_patience_epochs': '20', 'filter_broken_audio': '', 'sequence_len': '500', 'freq_compression': 'linear', 'n_freq_bins': '256', 'n_fft': '1024', 'hop_length': '172', 'sr': '44100', 'augmentation': '', 'resnet': '18', 'conv_kernel_size': '7', 'num_classes': '2', 'max_pool': '2', 'min_max_norm': '', 'fmin': '500', 'fmax': '10000'}
2024-06-11 18:12:40,105 - training animal-spot - INFO - Training Command: python3 -W ignore::UserWarning "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"/main.py --debug --data_dir "/Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET" --cache_dir "/Users/lionel/database/cache_dir" --model_dir "/Users/lionel/database/model_dir" --checkpoint_dir "/Users/lionel/database/checkpoint_dir" --log_dir "/Users/lionel/database/log_dir" --summary_dir "/Users/lionel/database/summary_dir" --noise_dir "/Users/lionel/database/noise_dir" --start_from_scratch --max_train_epochs 1 --epochs_per_eval 2 --batch_size 16 --num_workers 8 --lr 10e-5 --beta1 0.5 --lr_patience_epochs 8 --lr_decay_factor 0.5 --early_stopping_patience_epochs 20 --filter_broken_audio --sequence_len 500 --freq_compression linear --n_freq_bins 256 --n_fft 1024 --hop_length 172 --sr 44100 --augmentation --resnet 18 --conv_kernel_size 7 --num_classes 2 --max_pool 2 --min_max_norm --fmin 500 --fmax 10000
2024-06-11 18:12:40,105 - training animal-spot - INFO - Start Training!!!
18:12:43|D|dataOpts: {
    "sr": 44100,
    "preemphases": 0.98,
    "n_fft": 1024,
    "hop_length": 172,
    "n_freq_bins": 256,
    "fmin": 500,
    "fmax": 10000,
    "freq_compression": "linear",
    "min_level_db": -100,
    "ref_level_db": 20
}
18:12:43|D|Number of spectrogram time-steps (input size = time-steps x frequency-bins) : 128
18:12:43|I|Setting up model
18:12:43|D|Encoder: ResidualEncoder(
  (conv1): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu1): ReLU(inplace=True)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
)
18:12:43|D|Classifier: Classifier(
  (linear): Linear(in_features=512, out_features=2, bias=True)
)
18:12:43|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET/train
18:12:43|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET/val
18:12:43|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET/test
18:12:43|I|Found csv files in /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_MONK-PARAKEET_TARGET-NOISE-SEGMENTATION-DATASET
Traceback (most recent call last):
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/main.py", line 430, in <module>
    raise Exception("amount of automatically identified classes do not match amount of chosen classes!")
Exception: amount of automatically identified classes do not match amount of chosen classes!
Lionels-MacBook-Pro:TRAINING lionel$ python3 start_training.py config
2024-06-12 10:40:59,647 - training animal-spot - INFO - Config Data: {'src_dir': '"/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"', 'debug': '', 'data_dir': '"/Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS"', 'cache_dir': '"/Users/lionel/database/cache_dir"', 'model_dir': '"/Users/lionel/database/model_dir"', 'checkpoint_dir': '"/Users/lionel/database/checkpoint_dir"', 'log_dir': '"/Users/lionel/database/log_dir"', 'summary_dir': '"/Users/lionel/database/summary_dir"', 'noise_dir': '"/Users/lionel/database/noise_dir"', 'start_from_scratch': '', 'max_train_epochs': '1', 'epochs_per_eval': '2', 'batch_size': '16', 'num_workers': '8', 'lr': '10e-5', 'beta1': '0.5', 'lr_patience_epochs': '8', 'lr_decay_factor': '0.5', 'early_stopping_patience_epochs': '20', 'filter_broken_audio': '', 'sequence_len': '500', 'freq_compression': 'linear', 'n_freq_bins': '256', 'n_fft': '1024', 'hop_length': '172', 'sr': '44100', 'augmentation': '', 'resnet': '18', 'conv_kernel_size': '7', 'num_classes': '2', 'max_pool': '2', 'min_max_norm': '', 'fmin': '500', 'fmax': '10000'}
2024-06-12 10:40:59,647 - training animal-spot - INFO - Training Command: python3 -W ignore::UserWarning "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"/main.py --debug --data_dir "/Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS" --cache_dir "/Users/lionel/database/cache_dir" --model_dir "/Users/lionel/database/model_dir" --checkpoint_dir "/Users/lionel/database/checkpoint_dir" --log_dir "/Users/lionel/database/log_dir" --summary_dir "/Users/lionel/database/summary_dir" --noise_dir "/Users/lionel/database/noise_dir" --start_from_scratch --max_train_epochs 1 --epochs_per_eval 2 --batch_size 16 --num_workers 8 --lr 10e-5 --beta1 0.5 --lr_patience_epochs 8 --lr_decay_factor 0.5 --early_stopping_patience_epochs 20 --filter_broken_audio --sequence_len 500 --freq_compression linear --n_freq_bins 256 --n_fft 1024 --hop_length 172 --sr 44100 --augmentation --resnet 18 --conv_kernel_size 7 --num_classes 2 --max_pool 2 --min_max_norm --fmin 500 --fmax 10000
2024-06-12 10:40:59,647 - training animal-spot - INFO - Start Training!!!
10:41:32|D|dataOpts: {
    "sr": 44100,
    "preemphases": 0.98,
    "n_fft": 1024,
    "hop_length": 172,
    "n_freq_bins": 256,
    "fmin": 500,
    "fmax": 10000,
    "freq_compression": "linear",
    "min_level_db": -100,
    "ref_level_db": 20
}
10:41:32|D|Number of spectrogram time-steps (input size = time-steps x frequency-bins) : 128
10:41:32|I|Setting up model
10:41:32|D|Encoder: ResidualEncoder(
  (conv1): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu1): ReLU(inplace=True)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (shortcut): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu1): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu2): ReLU(inplace=True)
    )
  )
)
10:41:32|D|Classifier: Classifier(
  (linear): Linear(in_features=512, out_features=2, bias=True)
)
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/train
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/val
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/test
10:41:32|I|Found csv files in /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS
10:41:32|I|Model predict 2 classes
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/train
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/val
10:41:32|D|Found dataset split file /Users/lionel/database/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS/test
10:41:32|I|Init dataset train...
10:41:32|D|Number of files : 517
10:41:32|D|Number of samples in train for target: 256
10:41:32|D|Number of samples in train for noise: 261
10:41:35|D|Init augmentation transforms for time and pitch shift
10:41:35|D|No noise augmentation
10:41:35|D|Init min-max-normalization activated
10:41:35|I|Init dataset val...
10:41:35|D|Number of files : 132
10:41:35|D|Number of samples in val for target: 90
10:41:35|D|Number of samples in val for noise: 42
10:41:38|D|Running without augmentation
10:41:38|D|Init min-max-normalization activated
10:41:38|I|Init dataset test...
10:41:38|D|Number of files : 221
10:41:38|D|Number of samples in test for target: 144
10:41:38|D|Number of samples in test for noise: 77
10:41:41|D|Running without augmentation
10:41:41|D|Init min-max-normalization activated
10:41:41|I|Init summary writer
10:41:41|D|Starting checkpoint writer thread
10:41:41|I|Init model on device 'cpu'
10:41:41|I|Class Distribution: {'noise': 0, 'target': 1}
10:41:41|I|Start training model binary_classifier
10:41:41|D|train|0|start
10:41:41|W|Traceback (most recent call last):
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/trainer.py", line 158, in fit
    self.train_epoch(
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/trainer.py", line 247, in train_epoch
    for i, (features, label) in enumerate(train_loader):
                                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 439, in __iter__
    return self._get_iterator()
           ^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 387, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1040, in __init__
    w.start()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_thread.lock' object

10:41:41|W|Aborting...
ChristianBergler commented 3 months ago

Hello, this seems to be related to multiprocessing, given the "_thread.lock" in the error. The system was originally designed on Linux; on Windows (and, as in your case, macOS), the multiprocessing used during data preprocessing and preparation can cause problems. You can try setting the command line parameter --num_workers 0. That disables the data-loading worker processes and should avoid the "_thread.lock" error.
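For context, here is a minimal sketch (not ANIMAL-SPOT code) of why the error shows up on macOS/Windows but not Linux: with num_workers > 0, PyTorch's DataLoader pickles the dataset to send it to each worker process, and a dataset holding a non-picklable attribute such as a threading.Lock (e.g. from a logger handle) fails under the "spawn" start method that macOS and Windows use by default, while Linux's "fork" avoids the pickling step. The ToyDataset below is purely illustrative.

import threading
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical dataset holding a non-picklable attribute."""
    def __init__(self):
        self.lock = threading.Lock()  # cannot be pickled, like a logger's internal lock

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        return torch.zeros(1), 0

if __name__ == "__main__":
    ds = ToyDataset()
    # num_workers=8 would raise "TypeError: cannot pickle '_thread.lock' object"
    # on spawn-based platforms; num_workers=0 keeps loading in the main process.
    loader = DataLoader(ds, batch_size=2, num_workers=0)
    for features, label in loader:
        print(features.shape, label)

In the config dump above this corresponds to the num_workers entry (currently 8), so setting it to 0 there has the same effect as the command line flag.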

liofeu commented 3 months ago

It solved the issue. Thank you!