thtrieu / darkflow

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices
GNU General Public License v3.0

Load weights from 2 different sources (.weights and .ckpt) #485

Open fabiocapsouza opened 6 years ago

fabiocapsouza commented 6 years ago

I have been trying to fine-tune the tiny-yolo model to my dataset following these steps:

  1. Freeze all weights except the last convolutional layer (that acts as a classifier + bounding box regressor) and train until convergence (transfer learning)
  2. Unfreeze the 2 or 3 convolutional layers before the last layer and train the model with a smaller learning rate.
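The two stages above can be pictured as a per-layer "trainable" mask. The following is only a minimal sketch in plain Python: `trainable_mask` is a hypothetical helper, not darkflow code, and it assumes that a setting like `ntrain` means "train only the last `ntrain` layers".

```python
def trainable_mask(num_layers, ntrain):
    """Return one bool per layer: True if the layer is among the last `ntrain`.

    Hypothetical helper for illustration; it assumes "ntrain" counts
    trainable layers from the end of the network.
    """
    if not 0 <= ntrain <= num_layers:
        raise ValueError("ntrain must be between 0 and num_layers")
    first_trainable = num_layers - ntrain
    return [i >= first_trainable for i in range(num_layers)]


# Hypothetical 6-layer network, used only to show the two stages.
stage1 = trainable_mask(6, 1)  # step 1: only the last layer is trained
stage2 = trainable_mask(6, 3)  # step 2: the last layer plus the 2 before it
```

In step 2 the learning rate would also be lowered, which this sketch does not model.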

To freeze the layers, I changed the value of self.ntrain in build.py. Step 1 is needed because my dataset has different classes (I changed num, classes and filters in the .cfg file, and also labels.txt), so the last layer starts from random weights. Without step 1, those random weights could produce large gradients that would "destroy" the pretrained weights of the 2 or 3 layers before it.
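As a side note on the .cfg changes: for a YOLOv2-style region layer, the filters value of the last convolutional layer is normally num * (classes + 5), where 5 covers the four box coordinates plus one objectness score. A small sketch (the function name `region_filters` is made up for illustration):

```python
def region_filters(num, classes):
    """Filters for the last conv layer before a YOLOv2-style region layer.

    Each of the `num` anchor boxes predicts 4 box coordinates,
    1 objectness score and `classes` class scores.
    """
    return num * (classes + 5)


# tiny-yolo-voc uses 5 anchors and 20 classes.
voc_filters = region_filters(5, 20)

# Hypothetical custom dataset with 3 classes and the same 5 anchors.
custom_filters = region_filters(5, 3)
```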

To do that, I need to load the checkpoint from step 1 to continue training in step 2. However, I inspected the checkpoint created after step 1 using the tensorflow.python.tools.inspect_checkpoint.print_tensors_in_checkpoint_file function, and it contains only the weights of the last layers and the optimizer's (Adam) tensors. So when I use --load -1 in step 2, the outputs change completely because the first 22 layers aren't loaded from anywhere (they are garbage). This is related to issues #370 and #371. The other weights, which were frozen, should be loaded from tiny-yolo-voc.weights, but that would also need the load argument.
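To make the checkpoint contents easier to read, the tensor names can be split into model weights and optimizer state. The sketch below is plain Python and makes two assumptions: that the names were already obtained (for example with `tf.train.list_variables`, which returns (name, shape) pairs), and that the optimizer is TF1's Adam, whose slot variables are named `<var>/Adam` and `<var>/Adam_1` alongside `beta1_power` and `beta2_power`. The example names are hypothetical, not taken from a real darkflow checkpoint.

```python
def split_checkpoint_names(names):
    """Split tensor names into (model, optimizer) lists.

    Assumes TF1-style Adam naming; other optimizers would need
    different suffixes.
    """
    optimizer_suffixes = ("/Adam", "/Adam_1")
    optimizer_globals = {"beta1_power", "beta2_power"}
    model, optimizer = [], []
    for name in names:
        if name in optimizer_globals or name.endswith(optimizer_suffixes):
            optimizer.append(name)
        else:
            model.append(name)
    return model, optimizer


# Hypothetical tensor names for a checkpoint with one trained layer.
example_names = [
    "last_conv/kernel",
    "last_conv/biases",
    "last_conv/kernel/Adam",
    "last_conv/kernel/Adam_1",
    "beta1_power",
    "beta2_power",
]
model_vars, optimizer_vars = split_checkpoint_names(example_names)
```

If `model_vars` lists only the last layer, the frozen layers were not saved and have to come from somewhere else.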

I see 3 possible solutions:

  1. Make the load option capable of loading two files sequentially, e.g., load a .weights file and then a .ckpt file, which could replace some of the weights loaded from the first file. Is this possible? I ask because I don't fully understand how TF graphs and sessions work. I tried to understand how the weights loader works, but I couldn't figure out a way to do both.

  2. Have an option to save all the model parameters in a checkpoint, instead of only the trainable ones. This would create some overhead because it would save unmodified weights, but it would be extremely useful for partial training.

  3. Have an option to export a .weights file instead of only .ckpt and .pb/.meta. With this, one could change parameters in the .cfg file (like width and height) and test the model using the same weights for other configurations (I couldn't do it using .pb and .meta, for example. Is it possible?).
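The idea behind option 1 can be sketched without TensorFlow: treat each source as a mapping from variable name to values, load the full set first, then overlay whatever the second source provides. This is only an illustration of the load order, not darkflow's loader; `overlay_weights` and all names and values below are hypothetical.

```python
def overlay_weights(base, override):
    """Return (merged, skipped).

    `merged` starts as a copy of `base` (e.g. values from a .weights file)
    and takes every entry of `override` (e.g. values from a .ckpt file)
    whose name also exists in `base`. Names found only in `override`,
    such as optimizer tensors, are returned in `skipped`.
    """
    merged = dict(base)
    skipped = []
    for name, value in override.items():
        if name in merged:
            merged[name] = value
        else:
            skipped.append(name)
    return merged, skipped


# Hypothetical values: pretrained weights for every layer...
pretrained = {"conv1/kernel": [0.1, 0.2], "last_conv/kernel": [0.0, 0.0]}
# ...and a checkpoint holding only the fine-tuned last layer plus Adam state.
checkpoint = {"last_conv/kernel": [0.7, 0.9], "beta1_power": [0.9]}

merged, skipped = overlay_weights(pretrained, checkpoint)
```

A real implementation would also need to check that the shapes of overridden tensors match.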

liuhantao9 commented 6 years ago

Do you know how to freeze the graph? I am having trouble getting a .ckpt file from darkflow.

arthurfortes commented 4 years ago

3. Have an option to export .weights file instead of only .ckpt and .pb/.meta. With this, one could change parameters in the .cfg file (like width and height) and test the model using the same weights for other configurations (I couldn't do it using .pb and .meta, for example. Is it possible?).

Did you find any solution?