microsoft / aerial_wildlife_detection

Tools for detecting wildlife in aerial images using active learning
MIT License
230 stars, 58 forks

unable to autoTrain #9

Open GallonDeng opened 5 years ago

GallonDeng commented 5 years ago

Hi, I set "numImages_autotrain" to a small number (5) to test the autoTrain function. My system is Ubuntu 16.04, and all the AIDE modules run on a single machine with one AIWorker for a detection task. However, autoTrain ran only once and never restarted, even after new annotations were completed. It showed that the training and the task had completed. I then started the training process manually, and it worked a few times, but it got stuck when I repeated the cycle (annotation, then training). The status stayed at "PENDING" and never changed to "SUCCESS".

bkellenb commented 5 years ago

Hi,

Apologies for the delayed response. The "PENDING" message is usually down to the server engine (Gunicorn) spawning multiple threads. Essentially, the job gets scheduled and sent to an AIWorker by one thread, but potentially not broadcast to other, new threads. This only affects the status messages for the GUI, not the actual training, and we are working on a fix for it for the next release.

In the meantime, the default implementations of e.g. RetinaNet should print to the command line while training, so if you have the terminal window of the AIWorker open, you should see the training messages accordingly.
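
The status mismatch described above can be illustrated with a small, self-contained sketch. This is not AIDE's code: the `ServerWorker` class, its methods, and the job ID `train-1` are hypothetical names made up for illustration. It only assumes that each server worker keeps its own in-memory record of the jobs it dispatched, and that a worker reports "PENDING" for any job it has no record of.

```python
class ServerWorker:
    """Hypothetical stand-in for one server worker with a private job table."""

    def __init__(self, name):
        self.name = name
        self.jobs = {}  # job ID -> last known status, visible to this worker only

    def submit(self, job_id):
        # The worker that schedules the job records it locally.
        self.jobs[job_id] = "STARTED"

    def finish(self, job_id):
        # Completion is also recorded only by the worker that tracked the job.
        self.jobs[job_id] = "SUCCESS"

    def status(self, job_id):
        # Assumption: an unknown job is reported as "PENDING".
        return self.jobs.get(job_id, "PENDING")


worker_a = ServerWorker("a")
worker_b = ServerWorker("b")  # a newer worker that never heard about the job

worker_a.submit("train-1")
worker_a.finish("train-1")

status_from_a = worker_a.status("train-1")
status_from_b = worker_b.status("train-1")
print(status_from_a, status_from_b)
```

Under these assumptions, a status request answered by `worker_b` keeps showing "PENDING" even though the training itself has finished, which matches the symptom that only the GUI messages are affected.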