pangenome / odgi

Optimized Dynamic Genome/Graph Implementation: understanding pangenome graphs
https://doi.org/10.1093/bioinformatics/btac308
MIT License

Visualise specific positions of a chromosome #499

Open hzlnutspread opened 1 year ago

hzlnutspread commented 1 year ago

Hi there, is there some way to drill down and get a closer look at specific regions of a chromosome? For example, if I wanted to look more closely at chromosome 3, positions 1000-100000, would this be possible using odgi?

For context, I am using a full-genome FASTA file. Running grep ">" on that file gives me chr01-chr29.

Thanks very much

subwaystation commented 1 year ago

Hi @hzlnutspread, please take a look at this tutorial: https://odgi.readthedocs.io/en/latest/rst/tutorials/extract_selected_loci.html.

hzlnutspread commented 1 year ago

Thanks @subwaystation, I have been trying and playing around for a while.

I know that the particular location of the chromosome I would like to look at is between 11,463,157 and 11,982,640 on chromosome 3 (length 519,483).

I do not know what the node ID is, and I doubt it would be a single node.

How would you suggest I go about this?

I have tried ./odgi/odgi.sif odgi extract -i "${input_file}" -r 11463157-11982640 -c 1 -d 0 -o "${output_path}/ME02_hapA_chr3_${node_id}.og"

however the error says that the range can not be found.

Any idea of how I could do this?

AndreaGuarracino commented 1 year ago

With the '-r' option you have to specify the path name too, so something like '-r chr03:11463157-11982640'.
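As a rough sketch of that suggestion: the `-r` target has to start with a path name that exists in the graph, so it can help to list the path names first (the FASTA headers are chr01-chr29, but the path names stored in the graph may be longer). The file names `graph.og` and `region.og` are placeholders, `chr03` is assumed to be the path name, and `make_range` is a hypothetical helper. The odgi documentation describes `-r` coordinates as 0-based, so it is worth checking `odgi extract --help` for your version before trusting the exact boundaries.

```shell
#!/bin/sh
# Hypothetical helper: build a range string in the form path:start-end.
make_range() {
    # $1 = path name, $2 = start, $3 = end
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

graph="graph.og"      # placeholder input graph
out="region.og"       # placeholder output graph
range=$(make_range chr03 11463157 11982640)   # chr03 is an assumed path name
echo "range: $range"

if command -v odgi >/dev/null 2>&1 && [ -f "$graph" ]; then
    # List the path names so the prefix of the range can be checked against them.
    odgi paths -i "$graph" -L
    # Same options as in the thread, with the path name added to -r.
    odgi extract -i "$graph" -r "$range" -c 1 -d 0 -o "$out"
else
    echo "odgi or $graph not found; skipping extraction"
fi
```

If the graph's path for chromosome 3 turns out to have a different name, that name goes in place of `chr03`.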


hzlnutspread commented 1 year ago
[Image: Screen Shot 2023-05-04 at 2 32 39 PM]

Thanks! That actually worked and there is no error. However, I am trying to do some analysis on this sex locus on chromosome 3, as seen above. When I run that command I simply get a whole list of path ranges but no node IDs. Could you suggest a different set of commands for me to run in order to visualise that particular range in more depth? Thanks very much!!

subwaystation commented 1 year ago

Hi @hzlnutspread, are you sure this is the extracted graph? Usually, path positions appear in the path name after the extraction.

If it is a complex locus, you may have to add -E or select larger values for -c or -d. That depends on the graph. Maybe the trials in https://github.com/pangenome/odgi/issues/478 shed more light on your problem?
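One way to try those variants side by side is sketched below. It is only an illustration: the file names, the range `chr03:11463157-11982640`, the value `-c 10`, and the image dimensions are all assumptions, and `run` is a hypothetical wrapper that prints each command and executes it only when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command; execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

graph="graph.og"                      # placeholder input graph
range="chr03:11463157-11982640"       # assumed path name and positions

# Variant 1: add -E (full range) to the options used earlier in the thread.
run odgi extract -i "$graph" -r "$range" -c 1 -d 0 -E -o region.E.og

# Variant 2: a larger number of context steps (-c); 10 is an arbitrary choice.
run odgi extract -i "$graph" -r "$range" -c 10 -d 0 -o region.c10.og

# Render each extracted subgraph so the results can be compared by eye.
run odgi viz -i region.E.og -o region.E.png -x 1500 -y 500
run odgi viz -i region.c10.og -o region.c10.png -x 1500 -y 500
```

By default this only prints the commands; running it with `DRY_RUN=0` executes them, which requires odgi on the PATH and a real input graph.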

hzlnutspread commented 1 year ago

@subwaystation hi there! No, this is not the extracted path. This is the full-chromosome .og file, visualised. I would like to know whether there is a standard or recommended way of extracting a portion of this graph, or of visualising the specific part in the middle of it, where gaps exist for some paths whereas P8hap2A_chr3_Ragtag has sequence there (assuming we know its exact positions).

Also, I wanted to ask how exactly the sorting algorithm works. It seems to do some funky stuff when applied to some of the genomes I've tested. It shouldn't shift things around, or truncate or remove parts of a genome or chromosome, right?

subwaystation commented 1 year ago

With the command line you mentioned above plus @AndreaGuarracino's suggestion, you should already be able to do so. Or was the tutorial not detailed enough? I agree that your graph is really messy, so sorting it first is the way to go. Which sorting algorithm did you apply? All the algorithm does is rearrange the nodes in 1D. That's it. It does not add or remove nodes, paths, steps, edges, ...
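A quick way to check this on a given graph is to compare the summary statistics before and after sorting. The sketch below assumes placeholder file names, uses `-p Ygs` as just one possible sort pipeline (substitute whichever sort was actually applied), and assumes the `odgi stats -S` summary contains only counts, so identical graphs give identical output. `same_summary` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when two summary files are byte-identical.
same_summary() {
    cmp -s "$1" "$2"
}

graph="graph.og"      # placeholder input graph
sorted="sorted.og"    # placeholder output graph

if command -v odgi >/dev/null 2>&1 && [ -f "$graph" ]; then
    # Summary (length, nodes, edges, paths, steps) before sorting.
    odgi stats -i "$graph" -S > before.txt
    # One possible sort pipeline; replace with the sort you actually ran.
    odgi sort -i "$graph" -o "$sorted" -p Ygs
    # Summary after sorting.
    odgi stats -i "$sorted" -S > after.txt
    if same_summary before.txt after.txt; then
        echo "summary unchanged by sorting"
    else
        echo "summary differs:"
        diff before.txt after.txt
    fi
else
    echo "odgi or $graph not found; skipping"
fi
```

If the two summaries match, the sort changed only the order of the nodes, which is what the visualisation reflects.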

If it helps, you can send the graph to me and I can try to extract your region in question.

hzlnutspread commented 1 year ago

@subwaystation hi there, how do I go about sending you the file?

subwaystation commented 1 year ago

If possible, please send it to simon.heumos@qbic.uni-tuebingen.de. @hzlnutspread