jzhoulab / orca

sequence-based prediction of multiscale genome structure from kilobase to whole-chromosome scale

Model Comparison & CLI #2

Closed X3N1A closed 1 year ago

X3N1A commented 1 year ago

Hello,

thank you for uploading your models. I have been testing them and noticed some things in orca_predict.py

  1. When running the 256 Mb model through the CLI, a few issues come up:

    • line #2989: I added a variable MOD = "256M" if arguments["--256m"] else "32M" so that the correct resources get called
    • load_resources(models=["32M"], use_cuda=use_cuda)
    • line #1064: I believe this line should read if target instead of if has_target
  2. When calling both the 32 Mb and the 256 Mb model on the same region with default wpos and mpos, the center point of the predictions at 128 kb resolution should be the same, correct?

    • For instance: for chr22:1-50818468, the 128 kb predictions of both models should be centered at 25409234. This is only true for the former:
    • 32 Mb model: --> 9409234 to 41409234
    • 256 Mb model: --> 9216000 to 41216000

Please let me know if I'm mistaken. Thank you!

jzthree commented 1 year ago

Thanks for catching the bugs in using the 256Mb model. Both issues in point 1 should be fixed now.

For 2, yes, the 32Mb and 256Mb models won't necessarily center on the same point. This is because, for the 256Mb model, the 32Mb prediction is aligned with the grid of the 256Mb prediction, so in this case the 256Mb model coordinates are multiples of 1024000.
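
The arithmetic behind the two coordinate ranges quoted above can be reproduced with a small sketch. This is not the code in orca_predict.py: the helper names `centered_window` and `snap_to_grid` are hypothetical, and rounding the start down to the grid is an assumption that happens to match the numbers reported for chr22:1-50818468.

```python
WINDOW_32MB = 32_000_000   # width of the 128 kb resolution prediction window
GRID_256MB = 1_024_000     # grid step of the 256Mb prediction, as stated above


def centered_window(region_start, region_end, window=WINDOW_32MB):
    """Window of the given width centered on the midpoint of the region."""
    center = (region_start + region_end) // 2
    start = center - window // 2
    return start, start + window


def snap_to_grid(start, grid=GRID_256MB, window=WINDOW_32MB):
    """Move the window start down to a multiple of the grid step.

    Assumption: the start is floored; the actual rounding rule may differ.
    """
    snapped = (start // grid) * grid
    return snapped, snapped + window


# chr22:1-50818468, the region from the question
start_32, end_32 = centered_window(1, 50_818_468)
start_256, end_256 = snap_to_grid(start_32)

print("32 Mb model: ", start_32, "to", end_32)
print("256 Mb model:", start_256, "to", end_256)
```

Under these assumptions the 32 Mb window stays centered on 25409234, while the snapped window starts at 9 × 1024000 = 9216000, which shifts its center by 193234 bp.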

X3N1A commented 1 year ago

Thank you for clarifying.