gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License

reference_based_layout arguments #76

RicardoGCCR closed this issue 4 years ago

RicardoGCCR commented 4 years ago

Hi, I have a few questions about improving the layout with a reference genome. I am running:

python ~/repos/panaroo/scripts/reference_based_layout.py 0 final_graph.gml capacity_cut_edges.txt --add_reference_edges

  1. Run as is, will it use "pan_genome_reference.fa" as the reference genome?
  2. How do I include a complete, curated genome when running this script? Should I replace the "0" with the file name?
  3. If that is possible, which file types can I use for the reference genome?

Thanks, I really enjoy running Panaroo!

aweimann commented 4 years ago

Hi Ricardo,

Sorry for the slight delay, and thanks for your interest in Panaroo! The first argument to the script is the ID of the target genome, which is simply the position of that genome's GFF in the input order of the original Panaroo call. Since it's 0 in your command, the script will use the first genome. I know this is a bit convoluted; I plan to change it so that you can use the actual name of the input genome. You can also check manually by inspecting the header of the final_graph.gml file, which lists all input genomes.
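For anyone who wants to script that manual check, here is a minimal sketch that is not part of Panaroo. It assumes the GML header stores one quoted genome name per line under a key called `isolateNames`; that key name is an assumption, so open your own final_graph.gml and adjust `key` if the header looks different. The helper names `list_header_values` and `genome_id` are made up for this example.

```python
import re


def list_header_values(gml_path, key="isolateNames"):
    """Return the quoted values stored under `key` in the GML graph header.

    ASSUMPTION: the header has one line per genome of the form
        isolateNames "genome_name"
    The key name is a guess; check it against your own file.
    """
    pattern = re.compile(r'^\s*' + re.escape(key) + r'\s+"(.*)"\s*$')
    values = []
    with open(gml_path, encoding="utf-8") as handle:
        for line in handle:
            tokens = line.split()
            # The graph header ends where the first node record begins.
            if tokens and tokens[0] == "node":
                break
            match = pattern.match(line)
            if match:
                values.append(match.group(1))
    return values


def genome_id(gml_path, genome_name, key="isolateNames"):
    """Return the 0-based position of `genome_name` in the header list.

    Raises ValueError if the genome was not part of the run.
    """
    return list_header_values(gml_path, key).index(genome_name)
```

Calling `list_header_values("final_graph.gml")` and enumerating the result shows which genome each position corresponds to, with 0 being the first one. `genome_id("final_graph.gml", "my_reference")` returns the number to pass as the first argument of the layout script, where `my_reference` is a placeholder for your genome's name as it appears in the header.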

In your case, it depends on whether the complete, curated genome you mention was included in the original Panaroo run. If not, you can either re-run Panaroo with it included or add it using panaroo-integrate.

I hope that helps. Please let us know if it works.

RicardoGCCR commented 4 years ago

Hi aweimann, thank you for your explanation. Everything is clear now; I will add the reference genome accordingly. Best regards, Ricardo