Can't load the pre-trained model

fuhailin commented 5 years ago

I tried to use the pre-trained model, but got the following error:

=> loading checkpoint '/home/fuhailin/runs/nondisjoint_l2norm/model_best.pth.tar'
Traceback (most recent call last):
  File "main.py", line 312, in <module>
    main()
  File "main.py", line 138, in main
    checkpoint = torch.load(args.resume)
  File "/home/fuhailin/apps/anaconda3/envs/py36/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/fuhailin/apps/anaconda3/envs/py36/lib/python3.6/site-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbe in position 2: ordinal not in range(128)

BryanPlummer commented 5 years ago

I suspect this is a version issue. Please verify that you have installed the correct version of PyTorch and are using Python 2 (your traceback shows you are using Python 3.6).
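
A quick way to check both versions from inside the environment is sketched below. The helper name `describe_environment` is made up for illustration and is not part of the repo; the script only reports what it finds and treats a missing torch install as `None`.

```python
import sys


def describe_environment():
    """Report the interpreter version and the installed torch version, if any.

    Hypothetical helper for illustration; not part of the repository.
    """
    info = {"python": "%d.%d" % sys.version_info[:2]}
    try:
        import torch  # imported lazily so the check works without torch too
        info["torch"] = torch.__version__
    except ImportError:
        info["torch"] = None
    return info


print(describe_environment())
```

If the reported Python version starts with 3, the checkpoint (which appears to have been saved under Python 2) is being unpickled by a different major version than the one that wrote it.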

chammika-become commented 5 years ago

I was able to reproduce the results. Here is my conda environment.yml file, @fuhailin. Create the conda env with: conda env create -f environment.yml

name: fashion-compat
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - blas=1.0=mkl
  - ca-certificates=2019.5.15=1
  - certifi=2019.6.16=py27_1
  - cffi=1.12.3=py27h2e261b9_0
  - cuda80=1.0=h205658b_0
  - cudatoolkit=8.0=3
  - cudnn=6.0.21=cuda8.0_0
  - freetype=2.9.1=h8a8886c_1
  - intel-openmp=2019.4=243
  - jpeg=9b=h024ee3a_2
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc=7.2.0=h69d50b8_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran=3.0.0=1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.0.10=h2733197_2
  - mkl=2017.0.4=h4c4d0af_0
  - nccl=1.3.4=cuda8.0_1
  - ncurses=6.1=he6710b0_1
  - numpy=1.13.3=py27ha266831_3
  - olefile=0.46=py27_0
  - openssl=1.1.1c=h7b6447c_1
  - pillow=6.1.0=py27h34e0f95_0
  - pip=19.2.2=py27_0
  - pycparser=2.19=py27_0
  - python=2.7.16=h8b3fad2_4
  - pytorch=0.1.12=py27cuda8.0cudnn6.0_1
  - readline=7.0=h7b6447c_5
  - scikit-learn=0.18.2=np113py27_0
  - scipy=0.19.1=np113py27_0
  - setuptools=41.0.1=py27_0
  - six=1.12.0=py27_0
  - sqlite=3.29.0=h7b6447c_0
  - tk=8.6.8=hbc83047_0
  - torchvision=0.1.8=py27_0
  - wheel=0.33.4=py27_0
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.3.7=h0b5b093_0
prefix: /home/chammika/.conda/envs/fashion-compat

chammika-become commented 5 years ago

Actually, I could only run inference with the trained model. When training, I get NaNs. @BryanPlummer, could you please share the packages/versions of the environment used to train the model, via pip freeze or conda env export? Thank you.

Train Epoch: 1 [0/686851]   Loss: 0.3000 (0.3000)   Acc: 0.00% (0.00%)  Emb_Norm: 0.75 (0.75)
Train Epoch: 1 [64000/686851]   Loss: 0.0000 (0.0012)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [128000/686851]  Loss: 0.0000 (0.0006)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [192000/686851]  Loss: 0.0000 (0.0004)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [256000/686851]  Loss: 0.0000 (0.0003)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [320000/686851]  Loss: 0.0000 (0.0002)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [384000/686851]  Loss: 0.0000 (0.0002)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [448000/686851]  Loss: 0.0000 (0.0002)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [512000/686851]  Loss: 0.0000 (0.0001)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [576000/686851]  Loss: 0.0000 (0.0001)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)
Train Epoch: 1 [640000/686851]  Loss: 0.0000 (0.0001)   Acc: 0.00% (0.00%)  Emb_Norm: nan (nan)

BryanPlummer commented 5 years ago

Some packages are not required for this repo, but this should do it.

backports-abc==0.5
backports.functools-lru-cache==1.4
certifi==2017.7.27.1
cffi==1.10.0
chardet==3.0.4
conda==4.3.16
cycler==0.10.0
Cython==0.27
easydict==1.7
enum34==1.1.6
h5py==2.7.1
idna==2.5
matplotlib==2.1.1
mpmath==0.19
nltk==3.2.5
numpy==1.13.1
olefile==0.44
opencv-python==3.3.0.10
Pillow==4.2.1
pkg-resources==0.0.0
pycocotools==2.0
pycosat==0.6.1
pycparser==2.18
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.3
PyYAML==3.12
pyzmq==16.0.2
requests==2.18.3
ruamel.ordereddict==0.4.13
ruamel.yaml==0.15.34
scikit-learn==0.19.0
scipy==0.19.1
singledispatch==3.4.0.3
six==1.10.0
sklearn==0.0
subprocess32==3.2.7
sympy==1.1.1
torch==0.1.12.post2
torchvision==0.1.9
tornado==4.5.1
urllib3==1.22
visdom==0.1.5

marzooq-unbxd commented 1 year ago

Just pass encoding='latin1' to torch.load if you're using torch>0.4.
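
For anyone wondering why that helps: the error in the original report is what Python 3's unpickler raises when it meets a Python 2 byte string containing a non-ASCII byte, because it decodes such strings as ASCII by default. Here is a minimal sketch using only the standard `pickle` module. The byte string `PY2_PICKLE` is hand-written illustrative data, not taken from the actual checkpoint, and `load_py2_pickle` is a hypothetical helper name.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 byte string 'ab\xbe'
# (illustrative data only, not extracted from model_best.pth.tar).
PY2_PICKLE = b"\x80\x02U\x03ab\xbeq\x00."


def load_py2_pickle(data, encoding=None):
    """Unpickle Python 2 data under Python 3, optionally overriding the encoding."""
    if encoding is None:
        # Default behaviour: Python 2 str objects are decoded as ASCII.
        return pickle.loads(data)
    return pickle.loads(data, encoding=encoding)


try:
    load_py2_pickle(PY2_PICKLE)
except UnicodeDecodeError as err:
    print("default decoding failed:", err)

# latin1 maps every byte 0-255 to a code point, so decoding cannot fail.
print(repr(load_py2_pickle(PY2_PICKLE, encoding="latin1")))
```

With torch, the corresponding call would be `torch.load(args.resume, encoding='latin1')`, since `torch.load` forwards extra keyword arguments to the unpickler. Whether a checkpoint written by torch 0.1.12 then loads without further conversion on a newer torch is an assumption that depends on the installed version.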

mvasil / fashion-compatibility

Can't load the pre-trained model #11