victoresque / pytorch-template

PyTorch deep learning projects made easy.
MIT License

Something wrong with add_histogram function #96

Open ngocgiang99 opened 3 years ago

ngocgiang99 commented 3 years ago

I ran into a strange problem while implementing an EfficientNet model. Last week this code worked fine, but now, when I retrain, an error occurs. It happens in the validation step.

This is my code: https://github.com/ngocgiang99/Paper-Implementation. Please check out the branch feat_efficient_net.

This is the error log:

Traceback (most recent call last):
  File "train.py", line 73, in <module>
    main(config)
  File "train.py", line 54, in main
    trainer.train()
  File "D:\Work\Me\Paper-Implementation\efficient_net\base\base_trainer.py", l
    result = self._train_epoch(epoch)
  File "D:\Work\Me\Paper-Implementation\efficient_net\trainer\trainer.py", lin
    val_log = self._valid_epoch(epoch)
  File "D:\Work\Me\Paper-Implementation\efficient_net\trainer\trainer.py", lin
    self.writer.add_histogram(name, p, bins='auto')
  File "D:\Work\Me\Paper-Implementation\efficient_net\logger\visualization.py"
    add_data(tag, data, self.step, *args, **kwargs)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tenso
in add_histogram
    histogram(tag, values, bins, max_bins=max_bins), global_step, walltime)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tenso in histogram
    hist = make_histogram(values.astype(float), bins, max_bins)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tenso in make_histogram
    counts, limits = np.histogram(values, bins=bins)
  File "<__array_function__ internals>", line 6, in histogram
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\lib\histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\lib\histogrn_edges
    endpoint=True, dtype=bin_type)
  File "<__array_function__ internals>", line 6, in linspace
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\core\function_base.py", line 135, in linspace
    y = _nx.arange(0, num, dtype=dt).reshape((-1,) + (1,) * ndim(delta))
numpy.core._exceptions.MemoryError: Unable to allocate 10.3 TiB for an array with shape (1418558411252,) and data type float64

Conda environment:

# Name                    Version                   Build  Channel
absl-py                   0.12.0                   pypi_0    pypi
ca-certificates           2021.4.13            haa95532_1
cachetools                4.2.2                    pypi_0    pypi
certifi                   2020.12.5        py37haa95532_0
chardet                   4.0.0                    pypi_0    pypi
google-auth               1.30.1                   pypi_0    pypi
google-auth-oauthlib      0.4.4                    pypi_0    pypi
grpcio                    1.38.0                   pypi_0    pypi
idna                      2.10                     pypi_0    pypi
importlib-metadata        4.3.1                    pypi_0    pypi
markdown                  3.3.4                    pypi_0    pypi
numpy                     1.20.3                   pypi_0    pypi
oauthlib                  3.1.0                    pypi_0    pypi
openssl                   1.1.1k               h2bbff1b_0
pandas                    1.2.4                    pypi_0    pypi
pillow                    8.2.0                    pypi_0    pypi
pip                       21.1.1           py37haa95532_0
protobuf                  3.17.1                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
python                    3.7.10               h6244533_0
python-dateutil           2.8.1                    pypi_0    pypi
pytz                      2021.1                   pypi_0    pypi
requests                  2.25.1                   pypi_0    pypi
requests-oauthlib         1.3.0                    pypi_0    pypi
rsa                       4.7.2                    pypi_0    pypi
setuptools                52.0.0           py37haa95532_0
six                       1.16.0                   pypi_0    pypi
sqlite                    3.35.4               h2bbff1b_0
tensorboard               2.5.0                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.0                    pypi_0    pypi
torch                     1.8.1+cu111              pypi_0    pypi
torchaudio                0.8.1                    pypi_0    pypi
torchvision               0.9.1+cu111              pypi_0    pypi
tqdm                      4.61.0                   pypi_0    pypi
typing-extensions         3.10.0.0                 pypi_0    pypi
urllib3                   1.26.5                   pypi_0    pypi
vc                        14.2                 h21ff451_1
vs2015_runtime            14.27.29016          h5e58377_2
werkzeug                  2.0.1                    pypi_0    pypi
wheel                     0.36.2             pyhd3eb1b0_0
wincertstore              0.2                      py37_0
zipp                      3.4.1                    pypi_0    

Thanks.

MohamedA95 commented 3 years ago

Hi, I am not an expert, but just out of curiosity: is it normal for it to try to allocate 10.3 TiB?
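
As a quick sanity check on that number, the figure in the error message matches the requested array shape. This is only arithmetic on the values printed in the traceback (1418558411252 elements of float64, 8 bytes each); the variable names are made up for illustration.

```python
# Values copied from the MemoryError message in the traceback.
num_elements = 1_418_558_411_252  # shape (1418558411252,)
bytes_per_element = 8             # float64

total_bytes = num_elements * bytes_per_element
total_tib = total_bytes / 2**40   # 1 TiB = 2**40 bytes

print(f"{total_tib:.1f} TiB")
```

So the size itself is consistent: NumPy really was asked for about 1.4 trillion bin edges, which points at the bin count rather than at the size of the tensor being logged.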

ngocgiang99 commented 3 years ago

Hi @MohamedA95, I printed the shape of the p variable. The shapes are small, so I don't know why it tries to allocate such a large array. Here is the log from printing the shape of p:

histogram shape:  torch.Size([32, 3, 3, 3])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32, 32, 1, 1])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32, 1, 3, 3])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([32])
histogram shape:  torch.Size([16, 32, 1, 1])
histogram shape:  torch.Size([16])
histogram shape:  torch.Size([16])
histogram shape:  torch.Size([16])
histogram shape:  torch.Size([96, 16, 1, 1])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([96, 1, 3, 3])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([96])
histogram shape:  torch.Size([24, 96, 1, 1])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([144, 24, 1, 1])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144, 1, 3, 3])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([24, 144, 1, 1])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([24])
histogram shape:  torch.Size([144, 24, 1, 1])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144, 1, 5, 5])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([144])
histogram shape:  torch.Size([40, 144, 1, 1])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([240, 40, 1, 1])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240, 1, 5, 5])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([40, 240, 1, 1])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([40])
histogram shape:  torch.Size([240, 40, 1, 1])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240, 1, 3, 3])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([240])
histogram shape:  torch.Size([80, 240, 1, 1])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([480, 80, 1, 1])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480, 1, 3, 3])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([80, 480, 1, 1])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([480, 80, 1, 1])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480, 1, 3, 3])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([80, 480, 1, 1])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([80])
histogram shape:  torch.Size([480, 80, 1, 1])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480, 1, 5, 5])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([480])
histogram shape:  torch.Size([112, 480, 1, 1])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([672, 112, 1, 1])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672, 1, 5, 5])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([112, 672, 1, 1])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([672, 112, 1, 1])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672, 1, 5, 5])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([112, 672, 1, 1])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([112])
histogram shape:  torch.Size([672, 112, 1, 1])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672, 1, 5, 5])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([672])
histogram shape:  torch.Size([192, 672, 1, 1])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([1152, 192, 1, 1])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152, 1, 5, 5])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([192, 1152, 1, 1])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([1152, 192, 1, 1])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152, 1, 5, 5])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([192, 1152, 1, 1])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([1152, 192, 1, 1])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152, 1, 5, 5])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([1152])
histogram shape:  torch.Size([192, 1152, 1, 1])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([192])
histogram shape:  torch.Size([1152, 192, 1, 1])
histogram shape:  torch.Size([1152])
Traceback (most recent call last):
  File "train.py", line 73, in <module>
    main(config)
  File "train.py", line 54, in main
    trainer.train()
  File "D:\Work\Me\Paper-Implementation\efficient_net\base\base_trainer.py", line 63, in train
    result = self._train_epoch(epoch)
  File "D:\Work\Me\Paper-Implementation\efficient_net\trainer\trainer.py", line 68, in _train_epoch
    val_log = self._valid_epoch(epoch)
  File "D:\Work\Me\Paper-Implementation\efficient_net\trainer\trainer.py", line 100, in _valid_epoch
    self.writer.add_histogram(name, p, bins='auto')
  File "D:\Work\Me\Paper-Implementation\efficient_net\logger\visualization.py", line 65, in wrapper
    add_data(tag, data, self.step, *args, **kwargs)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tensorboard\writer.py", line 429, in add_histogram
    histogram(tag, values, bins, max_bins=max_bins), global_step, walltime)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tensorboard\summary.py", line 300, in histogram
    hist = make_histogram(values.astype(float), bins, max_bins)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\torch\utils\tensorboard\summary.py", line 309, in make_histogram
    counts, limits = np.histogram(values, bins=bins)
  File "<__array_function__ internals>", line 6, in histogram
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\lib\histograms.py", line 792, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\lib\histograms.py", line 448, in _get_bin_edges
    endpoint=True, dtype=bin_type)
  File "<__array_function__ internals>", line 6, in linspace
  File "C:\Users\PC\anaconda3\envs\general\lib\site-packages\numpy\core\function_base.py", line 135, in linspace
    y = _nx.arange(0, num, dtype=dt).reshape((-1,) + (1,) * ndim(delta))
numpy.core._exceptions.MemoryError: Unable to allocate 10.3 TiB for an array with shape (1418558411252,) and data type float64
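
One plausible explanation, not confirmed in this thread, is that the failure depends on the values in p rather than on its shape. With bins='auto', NumPy takes the narrower of the Sturges and Freedman-Diaconis bin widths, so a parameter whose values are almost all identical but which has one large outlier can produce an enormous bin count. The sketch below approximates that rule without allocating the bin edges. The helper names `estimate_auto_bins` and `choose_bins`, the `limit` and `fallback` values, and the sample arrays are all hypothetical, and the code assumes the values are finite.

```python
import numpy as np


def estimate_auto_bins(values):
    """Estimate the bin count np.histogram(values, bins='auto') would request.

    Approximates the documented 'auto' rule (the narrower of the Sturges and
    Freedman-Diaconis bin widths) without building the bin edges.
    """
    x = np.asarray(values, dtype=float).ravel()
    if x.size == 0:
        return 1
    data_range = float(x.max() - x.min())
    if data_range == 0.0:
        return 1

    # Sturges: width that yields log2(n) + 1 bins over the data range.
    sturges_width = data_range / (np.log2(x.size) + 1.0)

    # Freedman-Diaconis: width 2 * IQR * n^(-1/3).
    q75, q25 = np.percentile(x, [75, 25])
    fd_width = 2.0 * (q75 - q25) * x.size ** (-1.0 / 3.0)

    # 'auto' uses the smaller width, falling back to Sturges when the IQR is 0.
    width = min(fd_width, sturges_width) if fd_width > 0 else sturges_width
    return int(np.ceil(data_range / width))


def choose_bins(values, limit=512, fallback=64):
    """Return 'auto' when the estimate is small, otherwise a fixed bin count."""
    return "auto" if estimate_auto_bins(values) <= limit else fallback


# A well-behaved parameter: 1152 roughly normal values.
normal_like = np.random.default_rng(0).normal(size=1152)

# A made-up pathological parameter: 1151 nearly identical values plus one outlier.
skewed = np.concatenate([1.0 + np.linspace(0.0, 1e-9, 1151), [1e3]])

print(estimate_auto_bins(normal_like), choose_bins(normal_like))
print(estimate_auto_bins(skewed), choose_bins(skewed))
```

If this is what is happening here, passing the result as `bins` in the `add_histogram` call (for example `bins=choose_bins(p.detach().cpu().numpy())`), or simply using a fixed integer instead of 'auto', might avoid the allocation. That has not been tested against the code in this issue.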