Finetuning with EDDL+ECVL achieves lower accuracy than PyTorch

MicheleCancilla commented 3 years ago

We are having trouble improving the performance of our UC12 classification pipeline.

We changed the code to load a ResNet model pretrained on ImageNet (the ONNX file is exported by PyTorch), then replace the last layer and fine-tune the whole network on our data. This procedure works fine with PyTorch (80% validation accuracy: resnet18 after 60 epochs, resnet50 after 20 epochs), while EDDL plateaus at ~60% accuracy. I made two charts showing the evolution of the accuracy over time. Both pipelines (PyTorch and EDDL) use the same model (the same ONNX file), augmentations, loss, metric, and hyperparameters.

We first suspected that something was wrong with the input samples augmented by ECVL, but they look good and are augmented as expected (we used this EDDL-based code to inspect them), so we think the error is somewhere else.

Could you please have a look at the code skin_lesion_classification_training_reproducible.cpp to find out what is wrong?

How to reproduce:

EDDL pipeline

Prerequisites:

cmake >= 3.17.1
gcc >= 6 or clang >= 5
ISIC dataset - download instructions

git clone https://github.com/deephealthproject/use-case-pipelines.git
cd use-case-pipelines
# Locally install EDDL, OpenCV, ECVL and build the pipeline
chmod u+x build_pipeline.sh
./build_pipeline.sh
cd bin_lin
make

./SKIN_LESION_CLASSIFICATION_TRAINING_REPR -d /path/to/isic_classification.yml -c "resnet18.onnx" --gpu 1 --epochs 150 --batch_size 50 --learning_rate 1e-5

PyTorch pipeline

git clone https://github.com/deephealthproject/use-case-pipelines.git
cd use-case-pipelines/pytorch/skin_lesion_classification
pip3 install -r requirements.txt

python3 main.py "/path/to/isic_classification.yml" --model resnet18 --gpu 1 --batch_size 50 --learning_rate 1e-5 --onnx-export

Comparison

Tests were run with EDDL 0.9.1b and ECVL 0.3.4. Attachment: eddl_pt_comparison.xlsx

[Charts not preserved: accuracy over time for EDDL and PyTorch]

salvacarrion commented 3 years ago

Thank you! We'll check it soon

chavicoski commented 3 years ago

Hello, we recently fixed a bug in the batchnorm inference mode that may be causing this loss of accuracy. The fix is not in the master branch yet, but if you can, use the develop branch and repeat the experiment to see whether it still happens.
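For readers unfamiliar with why an inference-mode batchnorm bug would hurt validation accuracy while training looks normal, here is a minimal pure-Python sketch of the two modes. It is a toy single-feature layer written for illustration only; the class name `ToyBatchNorm`, its parameters, and the use of the biased batch variance for the running estimate are assumptions, and this is not EDDL's or PyTorch's implementation.

```python
import math


class ToyBatchNorm:
    """Toy single-feature batch normalization (hypothetical, for illustration)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, batch, training):
        if training:
            # Training mode: normalize with the statistics of the current batch
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # ...and update the running estimates used later at inference time
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Inference mode: normalize with the stored running estimates
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


bn = ToyBatchNorm()
data = [10.0, 12.0, 14.0, 16.0]

# Repeated training passes let the running estimates converge
for _ in range(200):
    train_out = bn.forward(data, training=True)

# With correct running statistics, inference output matches training output
eval_out = bn.forward(data, training=False)
max_diff = max(abs(a - b) for a, b in zip(train_out, eval_out))
print(f"running_mean={bn.running_mean:.4f} running_var={bn.running_var:.4f}")
print(f"max |train - eval| = {max_diff:.2e}")
```

If the inference branch used wrong or stale statistics, every batchnorm layer would shift its activations at validation time, which is consistent with a model that trains but scores poorly on the validation set.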

MicheleCancilla commented 3 years ago

That's good to know! @chavicoski Do the ONNX models that come from PyTorch need to be "simplified" with tools like onnx-simplifier? I didn't do that and EDDL loaded them with no errors/warnings. I exported the ONNX this way:

https://github.com/deephealthproject/use-case-pipelines/blob/0a4574248c8956bc215da85faef12f1b80594964/pytorch/skin_lesion_classification/main.py#L103-L112

dummy_input = torch.ones(4, 3, args.size, args.size, device='cpu')
model.train()
torch.onnx.export(model, dummy_input, f'{args.model}.onnx', verbose=True, export_params=True,
                  training=torch.onnx.TrainingMode.TRAINING,
                  opset_version=12,
                  do_constant_folding=False,
                  input_names=['input'],
                  output_names=['output'],
                  dynamic_axes={'input': {0: 'batch_size'},  # variable length axes
                                'output': {0: 'batch_size'}})

chavicoski commented 3 years ago

No, if EDDL loads the model you don't need to apply the ONNX simplifier. If the EDDL load function fails, you can try applying the ONNX simplifier to get rid of some operators that we don't support. But as I said, you don't need it in this case.
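As a hedged sketch of that fallback, the helper below runs onnx-simplifier from Python on an exported file. It assumes the `onnx` and `onnxsim` packages are installed; the function name `simplify_onnx` and the file names in the usage comment are hypothetical. The imports are placed inside the function only so the snippet can be defined without those packages present.

```python
def simplify_onnx(src_path, dst_path):
    """Simplify an ONNX file with onnx-simplifier and save the result.

    Hypothetical helper for illustration; only needed if loading the
    original file fails because of unsupported operators.
    """
    # Imported lazily so that defining this helper does not require the packages
    import onnx
    from onnxsim import simplify

    model = onnx.load(src_path)
    # simplify() returns the simplified model and a flag telling whether
    # the simplified model passed onnxsim's own validation check
    simplified_model, check_ok = simplify(model)
    if not check_ok:
        raise RuntimeError(f"onnx-simplifier could not validate {src_path}")
    onnx.save(simplified_model, dst_path)
    return dst_path


# Example usage (file names are placeholders):
# simplify_onnx("resnet18.onnx", "resnet18_simplified.onnx")
```

The simplified file would then be passed to the pipeline in place of the original one.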

MicheleCancilla commented 3 years ago

I repeated the tests with EDDL develop (457a71d).

[Charts not preserved: accuracy over time for EDDL and PyTorch]

The results are now good, as expected (I did not use cuDNN), although there are perhaps more spikes than with PyTorch. Thanks for the tips 😉
