Open Guillawme opened 2 years ago
Maybe we could have a flag in cryodrgn analyze for the start index of the volume numbering.
@Guillawme, @zhonge - ok, so I'm seeing 3 places where volumes are generated by cryoDRGN and where this option (start index for volume numbering) would take effect:
- cryodrgn analyze command:
  - kMeans<k>/ folder, where k is the number of k-means samples generated.
  - pc<1> to pc<N> folders, where N is the number of PC traversals generated.
- cryodrgn eval_vol command:
  - <output_dir>/ folder, where output_dir is the output directory specified to this command.
Let me know if there are any more.
EDIT - I'm seeing one more place where this functionality would need to be added to keep things consistent: the cryodrgn analyze_landscape command. This is essentially similar to analyze in that it generates many volumes in kMeans<k> and pc<1> to pc<N> folders.
This is all I can think of too.
Now another question: since this is becoming an option, what should be the default value? 0, to keep behavior consistent with previous versions of cryoDRGN? Or 1, to reduce friction by default?
I'd say we keep it at 0 for now. I'll add the rationale for this flag to the documentation so that you (and other users who, like you, are using ChimeraX) can benefit from it by overriding the default. At some future point we can change the default to 1.
@Guillawme - we added a --vol-start-index flag (default value 0) to the cryodrgn analyze command. Can you try this out and see if it addresses your use case? If so, I'll close this issue.
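For readers following along, here is a minimal sketch of how a start-index flag of this kind can shift output filenames. It is not cryoDRGN's actual implementation: only the flag name --vol-start-index, its default of 0, and the vol_NNN.mrc naming pattern come from this thread, while build_parser and volume_filenames are hypothetical helpers made up for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the flag name and its default come from this thread.
    parser = argparse.ArgumentParser(prog="analyze-sketch")
    parser.add_argument(
        "--vol-start-index",
        type=int,
        default=0,
        help="number given to the first generated volume",
    )
    return parser


def volume_filenames(n_volumes, start_index=0):
    # Shift every name by start_index; the number of volumes stays the same.
    return [f"vol_{i:03d}.mrc" for i in range(start_index, start_index + n_volumes)]


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--vol-start-index", "1"])
names = volume_filenames(3, args.vol_start_index)
print(names)
```

With the default of 0 the same helper yields vol_000.mrc onwards, so existing behavior is preserved unless the flag is passed.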
Hello!
Very sorry it took me so long to get back to this.
I have finally tested it, and it works nicely. This is so much easier to read now:
I think it will be beneficial if the default value becomes 1 in a future version.
Re-opening this issue because the numbering is still off in the UMAP and PCA plots found in the kmeans directory after running cryodrgn analyze. This is apparent in the very small cluster on the left in this UMAP plot:
And also in this PCA plot:
While the volumes from this job start at vol_001.mrc because I used --vol-start-index 1.
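One way to keep plot labels and filenames from drifting apart like this is to derive both from a single helper. The sketch below illustrates that idea under stated assumptions; volume_number, volume_filename and plot_label are hypothetical names, not functions from cryoDRGN, and the real plotting code may be organized quite differently.

```python
def volume_number(position, start_index=0):
    # Single source of truth: the number attached to the volume at this 0-based position.
    return position + start_index


def volume_filename(position, start_index=0):
    # Filename written to disk for the volume at this position.
    return f"vol_{volume_number(position, start_index):03d}.mrc"


def plot_label(position, start_index=0):
    # Text annotation placed next to the corresponding point in a UMAP or PCA plot.
    return str(volume_number(position, start_index))


start_index = 1
pairs = [(plot_label(i, start_index), volume_filename(i, start_index)) for i in range(3)]
print(pairs)
```

Because both the label and the filename go through volume_number, changing the start index moves them together.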
Describe the bug
This is not a bug report, only a proposal for a little improvement in user experience.
CryoDRGN numbers all the maps it generates from 0, but ChimeraX numbers all the maps it opens from 1, and this trips me up every single time I look at maps from cryoDRGN.
To Reproduce
Then, vol_000.mrc has model ID 1 in ChimeraX, vol_001.mrc has model ID 2, and so on. I always look at the model ID column in the model panel in ChimeraX, but the number in this column doesn't match the number in the UMAP plot.

Expected behavior
It would be a lot easier if the numbering from cryoDRGN started at vol_001.mrc. Or if ChimeraX numbered its open models from 0, if you can convince them that they should be the ones changing their software.

Additional context
It is definitely possible to look at the filename column, where the correct vol_???.mrc are listed, instead of the model ID column. But it is confusing when one got into the habit of ignoring the filename column (I got into this habit because output files from two RELION jobs have the same names, so in this situation only the model ID column is informative to remember which file comes from which job).
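The off-by-one described under To Reproduce can be made explicit with a small sketch. It assumes the volumes are opened in filename order in a fresh ChimeraX session, so that the first one receives model ID 1 as described above; chimerax_model_id is a hypothetical helper, not part of cryoDRGN or ChimeraX.

```python
import re


def chimerax_model_id(filename, vol_start_index=0):
    # Assumption: volumes are opened in order in a fresh session, first model gets ID 1.
    match = re.fullmatch(r"vol_(\d+)\.mrc", filename)
    if match is None:
        raise ValueError(f"unexpected volume filename: {filename!r}")
    # Position in the opened sequence (0-based), then shift to 1-based model IDs.
    return int(match.group(1)) - vol_start_index + 1


print(chimerax_model_id("vol_000.mrc"))     # numbering from 0: ID is one more than the filename
print(chimerax_model_id("vol_001.mrc", 1))  # numbering from 1: ID and filename agree
```

Under that assumption, starting the cryoDRGN numbering at 1 is what makes the model ID column and the filename column agree.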