Open ColdPopeye opened 1 year ago
Did you run a continuation job with the visualization when this error appeared? DynaMight downsamples only the output volumes that are saved, because of GPU memory.
At the moment continuation from epoch XX is not possible, but this will be added soon.
Yes, visualization on the last epoch was fine, and I am now running the inverse-deformation estimation. I also tried running a similar job outside of Relion (I saw that I can add a mask, so I did that as an extra):
(from the conda environment relion-5.0)
dynamight optimize-deformations --refinement-star-file Refine3D/job213/run_data.star --output-directory DynaMight/job224/ --initial-model Refine3D/job213/run_class001.mrc --n-gaussians 10000 --mask-file mask_job213_run_class001.mrc --gpu-id 1
But the error was the same Qt error, again appearing at different points: once at epoch 27 and once at epoch 1:
Cannot load backend 'QtAgg' which requires the 'qt' interactive framework, as 'headless' is currently running
Installed Xvfb and ran:
Xvfb :1 -screen 0 1280x1024x24 &
export DISPLAY=:1
DISPLAY=:1 dynamight optimize-deformations --refinement-star-file Refine3D/job213/run_data.star --output-directory DynaMight/job224/ --initial-model Refine3D/job213/run_class001.mrc --n-gaussians 10000 --mask-file mask_job213_run_class001.mrc --gpu-id 1
This seems to prevent it from crashing on my server. I still have no idea why it needs a fake display during the run; I assume it is writing some files that for some reason require a display? I am not very familiar with Qt.
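For anyone scripting this workaround, here is a minimal Python sketch of the same idea: wrap the command in xvfb-run only when no DISPLAY is set. The helper name with_virtual_display is hypothetical and not part of DynaMight or Relion. It assumes the xvfb-run wrapper script (shipped with Xvfb on Debian/Ubuntu) is on the PATH. The screen geometry is copied from the Xvfb call above.

```python
import os


def with_virtual_display(cmd, env=None):
    """Return cmd as-is if a DISPLAY is set, else prefixed with xvfb-run.

    Hypothetical helper for illustration; it only builds the argument list
    and does not start any process itself.
    """
    env = os.environ if env is None else env
    if env.get("DISPLAY"):
        return list(cmd)
    # -a picks a free server number, -s passes arguments through to Xvfb.
    return ["xvfb-run", "-a", "-s", "-screen 0 1280x1024x24"] + list(cmd)


# The command from this thread, split into an argument list.
dynamight_cmd = [
    "dynamight", "optimize-deformations",
    "--refinement-star-file", "Refine3D/job213/run_data.star",
    "--output-directory", "DynaMight/job224/",
    "--initial-model", "Refine3D/job213/run_class001.mrc",
    "--n-gaussians", "10000",
    "--mask-file", "mask_job213_run_class001.mrc",
    "--gpu-id", "1",
]

wrapped = with_virtual_display(dynamight_cmd, env={})  # pretend DISPLAY is unset
print(" ".join(wrapped[:4]), "...")
```

To actually launch the job, the resulting list could be passed to subprocess.run(wrapped, check=True). That call is left out here so the sketch has no side effects.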
I am trying the DynaMight job implemented in Relion 5 on a symmetric dataset, and I am running into an error that does not give enough explanation for me to solve it myself. Is there any way to continue a stopped DynaMight job from epoch XX? The Qt error, which I am not sure how to solve, seems to come at random times: the first time was almost at the start (epoch ~3, which led me to re-install Qt with sudo apt-get install qtbase5-dev qtchooser qt5-qmake qtbase5-dev-tools just in case) and the second time was at epoch 23.
Is there something I am missing with the Qt install on this machine?
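On the Qt question: the "Cannot load backend 'QtAgg' ... as 'headless' is currently running" message quoted in this thread has the form of the error matplotlib raises when an interactive backend is requested without a display. That suggests the system Qt packages may not be the missing piece. A minimal sketch of one thing that could be tried is to set matplotlib's MPLBACKEND environment variable to the non-interactive Agg backend for the child process. The helper name headless_env is hypothetical. Whether DynaMight honours this is an assumption, since code that selects a backend explicitly would override the variable.

```python
import os


def headless_env(base=None):
    """Return a copy of the environment with matplotlib forced to Agg.

    Hypothetical helper: MPLBACKEND is a real matplotlib setting, but whether
    it avoids the error in DynaMight is untested here.
    """
    env = dict(os.environ if base is None else base)
    env["MPLBACKEND"] = "Agg"  # non-interactive backend, needs no display
    return env


env = headless_env({"PATH": "/usr/bin"})
print(env["MPLBACKEND"])
```

The returned dictionary could be passed as env= to subprocess.run when launching the job. If the error persists, the Xvfb approach described in this thread remains the workaround that was reported to work.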
Environment:
Dataset:
Job options:
note.txt
in the job directory):
In the run.err I get two warnings: one related to how DynaMight bins the sample (I am not sure why it automatically bins differently; I will try binning before the next run):
/home/relion/miniconda3/envs/relion-5.0/lib/python3.10/site-packages/dynamight/models/decoder.py:233: UserWarning: Using a target size (torch.Size([275, 275, 275])) that is different to the input size (torch.Size([274, 274, 274])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  loss = torch.nn.functional.mse_loss(
/home/relion/miniconda3/envs/relion-5.0/lib/python3.10/site-packages/dynamight/models/decoder.py:233: UserWarning: Using a target size (torch.Size([137, 137, 137])) that is different to the input size (torch.Size([136, 136, 136])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
/home/relion/miniconda3/envs/relion-5.0/lib/python3.10/site-packages/torch/nn/functional.py:3737: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.