LucasKirsten / MobileMEF

Code related to the paper "MobileMEF: Fast and Efficient Method for Multi-Exposure Fusion"
GNU Affero General Public License v3.0

Inference code for multiple exposure images #1

Open dat-lequoc opened 3 days ago

dat-lequoc commented 3 days ago

Hi @LucasKirsten,

Thank you for the paper and great work. This will be very helpful for my use case.

[image]

Ground truth (brighter than the average; edited by a human): [image]

I'm attempting to use the pretrained model on my dataset, which contains multi-exposure images. I have a few questions:

  1. In the paper, you mention the model was trained on at least 2 exposures. Given my data, which specific model in /h5 would you recommend for my use case?

  2. My photos are 5312x3552 pixels. Can the model handle this resolution, and which GPUs are best suited for inference/training?

  3. Considering the differences between my data and your training set, do you think I should:
     a) use the pretrained model as-is,
     b) fine-tune the pretrained model on my dataset, or
     c) train a new model from scratch using my edited-photo ground truth?

Thank you very much for your insight 🤩!

LucasKirsten commented 3 days ago

Hi @dat-lequoc, thanks for your interest in MobileMEF!

  1. Your images seem similar to those of the SICE dataset, so I would recommend using either sice_ev1.h5 or sice_ev_most.h5. To choose between them, I also recommend testing different EV-frame input setups on each model to see which outputs suit you better.

  2. The models were trained with an input resolution of 4096x2816, so resizing the images to this resolution would give you the best possible results. However, I believe using other resolutions (such as that of your photos) would result in a negligible loss of quality. I also tested inference on 4096x4096 images on my laptop's NVIDIA RTX A3000 GPU (6 GB), and it handled them fine.

  3. I recommend starting with a pre-trained model and verifying which suits you better (as I mentioned in 1.), and, if the results are not good enough, fine-tuning the model from a pre-trained checkpoint. I would not recommend training from scratch, since it can take several hours and requires a GPU with high memory (40 GB+).
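As a rough illustration of the resize step in 2., here is a minimal sketch (not the repo's actual preprocessing; a real pipeline would use `cv2.resize` or `tf.image.resize` with proper interpolation, and the helper name and the 4096-wide by 2816-high interpretation are my assumptions):

```python
import numpy as np

# Hypothetical helper (not from the MobileMEF repo): nearest-neighbor
# resize of an HxWx3 frame to the assumed 4096x2816 training resolution.
TRAIN_H, TRAIN_W = 2816, 4096  # height, width

def resize_to_training_res(frame, out_h=TRAIN_H, out_w=TRAIN_W):
    in_h, in_w = frame.shape[:2]
    # Map each output pixel to its nearest source pixel.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return frame[rows][:, cols]

# A 5312x3552 photo (width x height) becomes 4096x2816:
photo = np.zeros((3552, 5312, 3), dtype=np.uint8)
resized = resize_to_training_res(photo)
print(resized.shape)  # (2816, 4096, 3)
```

The same helper works for any input size, so each EV frame can be brought to the training resolution before being fed to the network.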

P.S.: for fine-tuning the model, unfortunately, I cannot provide the training scripts, since they are proprietary to Motorola. But feel free to ask me about implementation issues you may have when trying to reproduce the paper.

dat-lequoc commented 3 hours ago

Thank you very much, @LucasKirsten, for your response.

  1. Could you please provide a requirements file (e.g., pip freeze) so I can easily install the necessary packages with pip? Conda isn't supported and can be difficult to set up for free GPU environments like Google Colab or Kaggle.

  2. How can I modify the code to handle 5 images instead of only 2 (under/over)? Will it work? Currently, I can only use 2: [image]
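For context, what I had in mind was something like concatenating the N frames channel-wise before the first layer (purely a guess on my side, not the repo's actual input pipeline; the function name is made up):

```python
import numpy as np

# Hypothetical sketch: if the network's first layer were rebuilt to
# accept N*3 input channels, N exposure frames could be concatenated
# along the channel axis.
def stack_exposures(frames):
    """frames: list of HxWx3 arrays, e.g. sorted from darkest to brightest EV."""
    assert all(f.shape == frames[0].shape for f in frames)
    return np.concatenate(frames, axis=-1)  # -> HxWx(3*N)

frames = [np.zeros((256, 256, 3), dtype=np.float32) for _ in range(5)]
stacked = stack_exposures(frames)
print(stacked.shape)  # (256, 256, 15)
```

Is something along these lines feasible with the pretrained weights, or would the input layer need retraining?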

Thank you so much again for your help. Have a nice day! 🤗🤗🤗

dat-lequoc commented 3 hours ago

(1) is solved. I used Runpod to create the conda env, then ran pip freeze, etc. Here's a working requirements.txt:

numpy==1.24.3
pandas==1.5.0
scipy==1.10.1
scikit-learn==1.2.2
tensorflow==2.10.0
keras==2.10.0
tensorboard==2.10.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow-estimator==2.10.0
Keras-Preprocessing==1.1.2
Pillow==9.4.0
requests==2.31.0
scikit-image==0.19.3
h5py==3.8.0
networkx==3.1
protobuf
grpcio
opencv-python
tqdm

dat-lequoc commented 3 hours ago

The output of the model is great, but it needs more training to adapt to professionally edited ground truth.

sice_ev1.h5: [image]
sice_ev_most.h5: [image]
my ground truth: [image]