Chiaraplizz / ST-TR

Spatial Temporal Transformer Network for Skeleton-Based Activity Recognition
MIT License

Problem using only joints for NTURGB #14

Closed: bszczapa closed this issue 3 years ago

bszczapa commented 3 years ago

Hi,

I am trying to get the code working on my PC. To do so, I generated the data from the NTURGB dataset using the preprocess.py script. When I try to run the main.py script, I get the following error:

Traceback (most recent call last):
  File "main.py", line 1015, in <module>
    processor.start()
  File "main.py", line 926, in start
    self.train(epoch, save_model=save_model)
  File "main.py", line 581, in train
    for batch_idx, (data, label, name) in enumerate(loader):
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\site-packages\torch\utils\data\dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\site-packages\torch\utils\data\dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\site-packages\torch\utils\data\dataloader.py", line 914, in __init__
    w.start()
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\Benjamin\Anaconda3\envs\py38\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
_pickle.UnpicklingError: pickle data was truncated
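
For context, this failure mode is typical of PyTorch DataLoader workers on Windows: worker processes are started with the spawn method, so the dataset object is pickled and sent to each child, and a very large in-memory dataset can overflow that transfer, producing exactly this OSError followed by the truncated-pickle error. Below is a minimal, illustrative sketch of a Windows-safe entry point (not the repo's actual main.py); the stand-in dataset is a placeholder for the repo's Feeder class.

```python
# Minimal sketch (illustrative, not the repo's main.py): on Windows the
# DataLoader spawns worker processes and pickles the dataset to each one;
# a huge in-memory dataset can overflow that transfer. num_workers=0
# sidesteps the pickling entirely by loading in the main process.
import torch
from torch.utils.data import DataLoader, TensorDataset

if __name__ == '__main__':  # required on Windows before starting workers
    # Stand-in dataset with NTU-like shape (C=3, T=300, V=25, M=2);
    # the real code would use the repo's Feeder class instead.
    dataset = TensorDataset(torch.randn(8, 3, 300, 25, 2), torch.zeros(8))
    loader = DataLoader(dataset, batch_size=4, shuffle=True,
                        num_workers=0)  # 0 = no worker processes
    for data, label in loader:
        print(data.shape)
```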

From what I understand, the preprocess.py script only generates the joint data. So my question is: do I also need to generate the bone data and merge it with the joints, or can I train the network on the joint data alone? Also, the error makes me think that not all the data is being loaded correctly, so I am considering testing the code on another PC with more memory. What are the specs of the machine on which you trained the network, and have you encountered this error before?
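
As background on the bone question: in two-stream skeleton pipelines (e.g. 2s-AGCN-style preprocessing), bone features are typically derived from the joints themselves as the vector from each joint to its parent, so no additional raw data is required. A hedged sketch, using an illustrative subset of joint pairs rather than the full 25-joint NTU skeleton:

```python
# Hedged sketch of bone-feature generation (not the repo's script).
# A bone is the coordinate difference between a joint and its parent.
import numpy as np

# (child, parent) pairs; illustrative subset, not the full NTU skeleton.
PAIRS = [(1, 0), (20, 1), (2, 20), (3, 2)]

def joints_to_bones(joints: np.ndarray) -> np.ndarray:
    """joints: (C, T, V, M) array of coordinates -> bone vectors, same shape."""
    bones = np.zeros_like(joints)
    for child, parent in PAIRS:
        bones[:, :, child, :] = joints[:, :, child, :] - joints[:, :, parent, :]
    return bones

bones = joints_to_bones(np.random.randn(3, 300, 25, 2))
```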

Thanks in advance.

Best regards. Benjamin

Chiaraplizz commented 3 years ago

Hi Benjamin,

You can use joint information only: in the paper I reported both the results obtained with joints only and those obtained with joints plus bones. Remember that in config.yml you should set channel: 3 and double_channel: False when using joint information only.
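
For concreteness, the relevant excerpt of config.yml would look like this (only the two keys mentioned above are shown; the rest of the file is unchanged):

```yaml
# config.yml excerpt for joints-only training (other keys omitted)
channel: 3            # x, y, z coordinates per joint
double_channel: False # set True only when joint and bone streams are concatenated
```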

I trained the model on two GeForce GTX 1080 GPUs with 12 GB of memory each.

I've never encountered this error during loading. It seems the data cannot be loaded correctly. You are running on GPU, right?

Chiara

bszczapa commented 3 years ago

Hi Chiara,

I solved my problem by switching to a different PC to train the network, and it worked. I think this is a memory issue that occurs when not enough memory is available. Yes, I am training the model on an RTX 2080 Ti with 11 GB.
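
If switching machines is not an option, one hedged workaround for this kind of memory pressure is to memory-map the preprocessed .npy file instead of reading it fully into RAM (the file name below is illustrative):

```python
# Hedged sketch: memory-map the preprocessed data so samples are read from
# disk on demand instead of holding the whole array in RAM.
import numpy as np

data = np.load('train_data_joint.npy', mmap_mode='r')  # no full copy in RAM
sample = np.array(data[0])  # materializes a single sample when needed
```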

Thank you.

Benjamin