ml-struct-bio / drgnai

Failure to run train related to tkinter #2

Open mpm116 opened 2 weeks ago

mpm116 commented 2 weeks ago

I've experimented with the number of workers and threads (including 1 worker and 1 thread), but I'm not sure what else to try. The run log is below:

(INFO) (reconstruct.py) (18-Jun-24 12:55:11) # =====> SGD Epoch: -1 finished in 0:03:03.966651; total loss = 1.087761
(INFO) (analysis.py) (18-Jun-24 12:55:15) Explained variance ratio:
(INFO) (analysis.py) (18-Jun-24 12:55:15) [0.26263918 0.25768008 0.24865601 0.23102474]
(INFO) (reconstruct.py) (18-Jun-24 12:55:21) Will use pose search on 57795 particles
(INFO) (reconstruct.py) (18-Jun-24 12:55:21) Will make a full summary at the end of this epoch
Exception ignored in: <function Image.__del__ at 0x7f70a700aee0>
Traceback (most recent call last):
  File "/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/tkinter/__init__.py", line 4017, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f70a6ff25e0>
Traceback (most recent call last):
  File "/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/tkinter/__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f70a6ff25e0>
Traceback (most recent call last):
  File "/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/tkinter/__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f70a6ff25e0>
Traceback (most recent call last):
  File "/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/tkinter/__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f70a6ff25e0>
Traceback (most recent call last):
  File "/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/tkinter/__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Aborted (core dumped)

After the abort, the following warning is printed automatically, and then everything hangs:

/hlowdata4/mpm116/software/miniconda3/envs/drgnai-env/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 11 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Any suggestions are welcome, and of course if any more information is required, please let me know! Thanks!

mpm116 commented 2 weeks ago

Update:

I removed 'tk' from the conda environment and reinstalled it. For some reason this also meant I had to reinstall Python 3.9. It now seems to be stable; I will report back on this.

Currently running with 1 worker and 16 threads...

Further update: the same error occurred after:

(INFO) (reconstruct.py) (18-Jun-24 18:41:21) # [Train Epoch: 0/108] [19904/57795 particles]

Even when setting the number of workers and threads to 1, I still see two pt_main_thread processes running.
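One thing that might be worth trying, as a sketch only: "main thread is not in main loop" from tkinter destructors is often seen when plotting objects created with a Tk-based GUI backend are garbage-collected in a non-main thread. Assuming (this is not confirmed anywhere in this thread) that the tkinter objects here come from matplotlib selecting its TkAgg backend, forcing the non-GUI Agg backend through matplotlib's `MPLBACKEND` environment variable may avoid the crash. The helper names `headless_env` and `run_headless` below are hypothetical, and the command to launch is left to the caller.

```python
import os
import subprocess


def headless_env(base=None):
    """Copy an environment mapping and force matplotlib's non-GUI Agg backend."""
    env = dict(os.environ if base is None else base)
    # MPLBACKEND is matplotlib's environment variable for choosing a backend.
    # Assumption: the tkinter objects in the traceback belong to matplotlib.
    env["MPLBACKEND"] = "Agg"
    return env


def run_headless(cmd):
    """Run cmd (a list of strings) with the headless environment.

    Returns the exit code of the child process.
    """
    return subprocess.run(cmd, env=headless_env(), check=False).returncode
```

Calling `run_headless([...])` with the usual training command as a list of strings would launch it with the backend overridden; exporting `MPLBACKEND=Agg` in the shell before starting the run should be equivalent. If the error persists with Agg, the tkinter objects probably come from somewhere else.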