Abe404 / RootPainter3D

RootPainter3D: Interactive-machine-learning enables rapid and accurate contouring for radiotherapy
GNU General Public License v3.0

force_fg triggers too many retries exception which kills server #31

Closed jenspete closed 1 year ago

jenspete commented 1 year ago

Training kills the server on some of my projects because the images often have only background annotated. It might be worth considering whether this case should be treated as severely as a load failure.

Abe404 commented 1 year ago

The offending code is here: https://github.com/Abe404/RootPainter3D/blob/f95f856274cbcbe5a690281d749732e2bb96dad7/trainer/im_utils.py#L252

The mechanism here is that when force_fg is true, the system retries if the sampled patch contains no foreground. force_fg is set with a certain probability early in training, to avoid sampling too many background-only patches at that stage.
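The retry mechanism described above can be sketched roughly as follows. This is a simplified illustration, not the code in `trainer/im_utils.py`: the function name `sample_patch`, its parameters, and the use of flat 0/1 lists in place of 3D annotation volumes are all assumptions made to keep the example short.

```python
import random


def sample_patch(annotations, patch_len, force_fg, max_retries, rng):
    """Hypothetical sampler: pick a random window from a random annotation.

    annotations: list of flat 0/1 lists (1 = foreground), standing in for
                 the real 3D annotation volumes.
    Returns (patch, attempts_used). Raises RuntimeError if force_fg is set
    and no attempt produced a patch containing foreground.
    """
    for attempt in range(1, max_retries + 1):
        annot = rng.choice(annotations)
        start = rng.randrange(len(annot) - patch_len + 1)
        patch = annot[start:start + patch_len]
        # Accept the patch unless foreground is required and missing.
        if not force_fg or any(patch):
            return patch, attempt
    raise RuntimeError(f"no foreground found in {max_retries} attempts")
```

With `force_fg` enabled and only background-annotated images available, every attempt is rejected and the final `raise` is reached. This is the kind of failure that an unhandled exception would turn into a server crash.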

I believe the server is crashing because max_retries was, for some reason, set to 2.

https://github.com/Abe404/RootPainter3D/commit/bedb9dc66d0c7d8cb9bfde2d0b4c2eedfd99cd6a

I've put it back up to 200. I consider it unlikely that you will get 200 randomly sampled images in a row without foreground. If you do, I can come up with a different solution.
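To get a feel for why 2 retries fail often and 200 rarely do, here is a back-of-the-envelope calculation. It assumes each attempt independently finds foreground with the same probability `p_fg`. The function name and the value 0.05 are illustrative assumptions, not measurements from any project.

```python
def prob_all_attempts_miss(p_fg, max_retries):
    """Probability that max_retries independent attempts all miss foreground.

    p_fg is the assumed chance that a single sampled patch contains
    foreground.
    """
    if not 0.0 <= p_fg <= 1.0:
        raise ValueError("p_fg must be between 0 and 1")
    return (1.0 - p_fg) ** max_retries


# Illustrative value only: 5% of sampled patches contain foreground.
print(prob_all_attempts_miss(0.05, 2))    # about 0.9025, so failure is the norm
print(prob_all_attempts_miss(0.05, 200))  # below 1e-4
```

Under the same assumption, `p_fg = 0` (no foreground annotated anywhere) gives a failure probability of 1 for any retry limit, so raising the limit would not help in that case.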

In general I don't think this retry solution is optimal, but hopefully it works as a temporary fix.

Can you please let me know if it is working OK for you now?

jenspete commented 1 year ago

It seems to be training now, although the server output is somewhat cluttered with exceptions, which might be confusing. However, it is okay for me :).

Abe404 commented 1 year ago

Actually, I agree these error messages aren't great. I will change the code so that not finding foreground is no longer an exception.

Abe404 commented 1 year ago

Fixed so it retries without exceptions/warnings. It is still in another branch, though. I will close this issue when it is merged into master.
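One way to retry without exceptions or warnings is to signal a missed attempt with a return value instead. The sketch below shows that pattern only. It is not necessarily what the linked commit does, and `try_sample`, `sample_quietly`, and the flat 0/1 list representation are hypothetical.

```python
import random


def try_sample(annotations, patch_len, force_fg, rng):
    """One sampling attempt; returns None instead of raising on a miss."""
    annot = rng.choice(annotations)
    start = rng.randrange(len(annot) - patch_len + 1)
    patch = annot[start:start + patch_len]
    if force_fg and not any(patch):
        return None  # no foreground: a normal outcome, not an error
    return patch


def sample_quietly(annotations, patch_len, force_fg, max_retries, rng):
    """Retry silently; return None if every attempt missed foreground."""
    for _ in range(max_retries):
        patch = try_sample(annotations, patch_len, force_fg, rng)
        if patch is not None:
            return patch
    return None  # the caller decides what to do, e.g. skip this step
```

Returning `None` keeps the server log clean and leaves the decision about a fully missed sample to the training loop rather than to an exception handler.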

https://github.com/Abe404/RootPainter3D/commit/dd8362cce25e5883a95e2c34e0ef1d014eaf24c7