MouseLand / cellpose

a generalist algorithm for cellular segmentation with human-in-the-loop capabilities
https://www.cellpose.org/
BSD 3-Clause "New" or "Revised" License

Training argument --save_each not working using Cellpose 3 #924

Open dfgdgdfgd opened 2 months ago

dfgdgdfgd commented 2 months ago

Hi,

in Cellpose 2 you could combine the training arguments --save_every xx and --save_each to automatically save a separate model every xx epochs, which worked very nicely. Unfortunately, --save_each is no longer accepted in Cellpose 3 (error: main.py: error: unrecognised arguments: --save_each). If I omit it and use only --save_every xx, just one model is saved at the end.

Any idea how to solve the problem?

Thanks & best, Mario

alscunha commented 1 month ago

The problem in 3.0 is that the model is saved under the same file name every time, regardless of the epoch number (iepoch in the code), so each save overwrites the previous one. Until an update fixes that, I patched lib/pythonX.XX/site-packages/cellpose/train.py under the Cellpose installation directory to save the current model under a per-epoch file name, model_path_save, as below:

        if iepoch > 0 and iepoch % save_every == 0:
            model_path_save = f"{model_path}_{iepoch}"
            net.save_model(model_path_save)

This will append the epoch number to the output name provided by either --model_name_out or the default one. Below is the patch file (somehow I cannot attach it even though .patch is an acceptable extension):

--- train.py    2024-05-01 23:46:07.605886495 -0700
+++ /home/XXXX/train.py 2024-05-02 17:28:24.161011936 -0700
@@ -475,7 +475,8 @@
             lavg, nsum = 0, 0

         if iepoch > 0 and iepoch % save_every == 0:
-            net.save_model(model_path)
+            model_path_save = f"{model_path}_{iepoch}"
+            net.save_model(model_path_save)
     net.save_model(model_path)

     return model_path
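
For anyone who wants to check which files the patched condition would produce before editing train.py, here is a minimal standalone sketch of the naming scheme. It does not import Cellpose. The helpers checkpoint_path and epochs_that_save are hypothetical names made up for illustration, and it assumes the training loop counts epochs from 0 up to n_epochs - 1.

```python
from pathlib import Path


def checkpoint_path(model_path, iepoch):
    """Append the epoch number to the model path, as the patch does.

    Hypothetical helper for illustration; it is not part of Cellpose.
    model_path may be a str or a Path, since an f-string formats both.
    """
    return Path(f"{model_path}_{iepoch}")


def epochs_that_save(n_epochs, save_every):
    """Return the epochs at which the patched condition triggers a save.

    Assumes the loop runs over range(n_epochs); epoch 0 is skipped
    because of the `iepoch > 0` check in the patch.
    """
    return [
        iepoch
        for iepoch in range(n_epochs)
        if iepoch > 0 and iepoch % save_every == 0
    ]


if __name__ == "__main__":
    # Placeholder path; the real one comes from --model_name_out or the default.
    model_path = Path("models") / "my_model"
    for iepoch in epochs_that_save(n_epochs=500, save_every=100):
        print(checkpoint_path(model_path, iepoch))
```

With the patch, the final model is still written under the unsuffixed name after the loop ends, so the per-epoch files are in addition to it.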

dfgdgdfgd commented 1 month ago

Great, thank you very much!!