3dem / model-angelo

Automatic atomic model building program for cryo-EM maps
MIT License

config.json file #95

Open chenwei-zhang opened 8 months ago

chenwei-zhang commented 8 months ago

Hi, in the config.json file, I don't quite understand the meaning of "crop": 6 in "ca_infer_args", or of "crop_length": 200 and "aggressive_pruning": false in "gnn_infer_args". Could you please give me some hints? Also, I would like to prune some short chains. How can I set the threshold?

Thank you!
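
For reference, the keys in question can be inspected with a few lines of Python. This is only a minimal sketch: the nesting follows the description above, the fragment contains just the three keys named there (the real config.json holds more), and nothing here says what the values mean.

```python
import json

# Fragment containing only the keys mentioned in this question.
# Assumption: the real config.json has further keys that are omitted here.
CONFIG_TEXT = """
{
  "ca_infer_args": {"crop": 6},
  "gnn_infer_args": {"crop_length": 200, "aggressive_pruning": false}
}
"""

config = json.loads(CONFIG_TEXT)

# Print every key as "section.key = value" so the settings are easy to compare.
for section in ("ca_infer_args", "gnn_infer_args"):
    for key, value in config[section].items():
        print(f"{section}.{key} = {value!r}")
```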

jamaliki commented 8 months ago

Hi!

Could you clarify what you mean by pruning the short chains? Would you like to prune all chains shorter than N residues from the output CIF file? I don't believe there is such an option, but it would be simple for me to write a quick Python script for that if you like.

Best, Kiarash

chenwei-zhang commented 8 months ago

Hi Kiarash, Thanks for your rapid reply.

  1. I would like to generate the structures in the ModelAngelo ICLR paper. In the paper you compared pruned and unpruned predictions, and I am wondering how you did this pruning. I found a sentence in the paper saying "chains shorter than 4 residues are pruned and the resulting coordinates are used as the input". May I ask whether this corresponds to the pruned prediction? Is there any way I could customize the cutting threshold of 4 residues? And if I use the latest version of ModelAngelo directly, without changing any configuration, will it generate the pruned or the unpruned structures?
  2. For the results shown in the paper, may I ask whether you used the original map (e.g. emd_26126.map) directly as the input for inference, or a post-processed map? If the latter, could you give some details on how you post-processed the map?

Sorry for so many questions, but ModelAngelo is awesome work and I really appreciate it. Thank you in advance for your answers.

Best, Chenwei

jamaliki commented 7 months ago

Hi @chenwei-zhang ,

  1. Pruned and unpruned refer to the output files: the pruned file is output.cif and the unpruned file is output_raw.cif. Changing the cutting threshold is not very clean, although I can point you to the place in the code you would have to change, if you like. Specifically, if you go to the class MatchToSequence, you will find the bulk of the pruning code. I'm sorry it is not very clean :disappointed:
  2. We always used post-processed files as deposited in the EMDB. For example, for EMD-26126, it would be this file: emdb_26126.map. The only post-processing ModelAngelo will really do is to change the pixel size.
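
As an alternative to editing MatchToSequence, a post-hoc length filter on the output file can be sketched in plain Python. This is not ModelAngelo's own pruning, only a rough illustration of "drop all chains shorter than N residues". The function name `prune_short_chains` is hypothetical, and the sketch assumes the CIF has a single `_atom_site` loop with one atom per line, no spaces inside values, and the standard `label_asym_id` and `label_seq_id` columns.

```python
from collections import defaultdict


def prune_short_chains(cif_text, min_residues,
                       chain_key="label_asym_id", seq_key="label_seq_id"):
    """Return cif_text without atom rows of chains shorter than min_residues.

    Hypothetical helper, not part of ModelAngelo. Only the _atom_site loop
    is filtered; any other category that mentions a removed chain is left
    untouched.
    """
    lines = cif_text.splitlines()

    # Find the column names of the _atom_site loop.
    i = next(n for n, line in enumerate(lines) if line.startswith("_atom_site."))
    columns = []
    while i < len(lines) and lines[i].startswith("_atom_site."):
        columns.append(lines[i].strip().split(".", 1)[1])
        i += 1
    first_row = i

    # Atom rows run until a blank line, "#", or the start of another block.
    while (i < len(lines) and lines[i].strip()
           and not lines[i].startswith(("#", "loop_", "_", "data_"))):
        i += 1
    end_row = i

    chain_col = columns.index(chain_key)
    seq_col = columns.index(seq_key)

    # Collect the distinct residue numbers seen in each chain.
    rows = [line.split() for line in lines[first_row:end_row]]
    residues = defaultdict(set)
    for row in rows:
        residues[row[chain_col]].add(row[seq_col])

    keep = {chain for chain, seen in residues.items() if len(seen) >= min_residues}
    kept = [line for line, row in zip(lines[first_row:end_row], rows)
            if row[chain_col] in keep]
    return "\n".join(lines[:first_row] + kept + lines[end_row:]) + "\n"


# Example call (the threshold of 4 is the value quoted from the paper):
# with open("output_raw.cif") as handle:
#     pruned_text = prune_short_chains(handle.read(), min_residues=4)
```

Because only the atom rows are filtered, the result is best treated as a quick way to compare thresholds. For anything that will be deposited or refined further, a proper mmCIF library would be the safer route.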