igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

entries in FASTA not listed in order #1489

Closed kevfengler227 closed 3 months ago

kevfengler227 commented 9 months ago

This is a difference between 2.16.0 and 2.17.2 (I haven't checked the versions in between). The order of chromosomes is not the same as in the FASTA, nor is it alphabetical. It appears to be random.

image

jrobinso commented 9 months ago

Noted. How is your genome specified, by json, .genome, or are you just loading a fasta?

The order in the fasta has never been the only determinant; many fastas are alphabetical and have chr11 before chr2, for example.
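To illustrate the chr11-before-chr2 point, here is a minimal Python sketch comparing plain alphabetical sorting with a numeric-aware ("natural") sort. The `natural_key` helper and the sample names are made up for illustration; this is not IGV's comparator.

```python
import re

def natural_key(name):
    """Split a name into text and digit chunks so digit runs compare as integers.

    re.split with a capturing group always alternates text, digits, text, ...
    so strings are only ever compared with strings and ints with ints.
    """
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p for p in parts]

# Hypothetical sequence names, as they might appear in a repository fasta.
names = ["chr1", "chr2", "chr11", "chrX"]

alphabetical = sorted(names)               # plain string comparison
natural = sorted(names, key=natural_key)   # digit runs compared numerically

print(alphabetical)  # ['chr1', 'chr11', 'chr2', 'chrX']
print(natural)       # ['chr1', 'chr2', 'chr11', 'chrX']
```

A viewer that showed sequences strictly in file order would put chr11 before chr2 for an alphabetically ordered fasta, which is presumably why file order alone is not used in every case.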

kevfengler227 commented 9 months ago

I load my genomes as a fasta. I name the sequences to avoid this problem, so that the names sort alphabetically and the sequences appear in that same order in the fasta.

So with 2.16.0 the displayed order was the same as the order in the fasta; now with 2.17.2 it is random. Here is the same genome.fasta loaded into 2.16.0, so something has changed.

image

kevfengler227 commented 9 months ago

I think the same order as the FASTA would be the expected behavior. Ideally folks put it in the order they want it.

jrobinso commented 9 months ago

Yes noted, definitely something has changed. Consider it a bug. Most folks are not creating their own fastas, they are just loading them from a repository, but I agree if you are creating your own that would be the expected behavior. I will fix it for the next point release. BTW technically the order in the fasta never controlled this, it was the order in the index (fai) file. I would rearrange the order of the index entries, which is much easier than rearranging the order in a large fasta such as human. That is the behavior I will restore.

When using the ".json" format you can specify the order in a property. The deprecated ".genome" format had a flag to control this.
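As a rough sketch of the "rearrange the index entries" idea: each .fai line stores an absolute byte offset into the fasta, so the lines can be reordered without touching the fasta itself. The helper names (`fai_names`, `reorder_fai`) and the sample entries below are hypothetical and only for illustration; this is not code from IGV.

```python
def fai_names(fai_text):
    """Return sequence names in the order they appear in a .fai index."""
    return [line.split()[0] for line in fai_text.splitlines() if line.strip()]

def reorder_fai(fai_text, desired_order):
    """Return index text with entries rearranged to follow desired_order.

    Entries not named in desired_order keep their relative order and are
    appended at the end. The fasta file itself is never modified.
    """
    lines = [line for line in fai_text.splitlines() if line.strip()]
    by_name = {line.split()[0]: line for line in lines}
    picked = [by_name[name] for name in desired_order if name in by_name]
    wanted = set(desired_order)
    rest = [line for line in lines if line.split()[0] not in wanted]
    return "\n".join(picked + rest) + "\n"

# Illustrative index for an alphabetically ordered fasta (made-up values).
# Columns: name, length, offset, bases per line, bytes per line.
sample_fai = (
    "chr1\t250\t6\t100\t101\n"
    "chr11\t120\t266\t100\t101\n"
    "chr2\t180\t394\t100\t101\n"
)

reordered = reorder_fai(sample_fai, ["chr1", "chr2", "chr11"])
print(fai_names(sample_fai))  # ['chr1', 'chr11', 'chr2']
print(fai_names(reordered))   # ['chr1', 'chr2', 'chr11']
```

Whether a given IGV version honors the index order is exactly what this issue is about, so treat this as a description of the intended behavior rather than a guarantee.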

kevfengler227 commented 9 months ago

OK, sounds good. thanks.

In this case the fai index has the same order as the fasta so I wonder how they are being re-arranged in the latest version?

Chr01A 78365069 8 100 101
Chr01B 73519017 79148736 100 101
Chr01C 77632021 153402952 100 101
Chr01D 78479562 231811302 100 101
Chr02A 77732718 311075668 100 101
Chr02B 69785945 389585722 100 101
Chr02C 69903205 460069535 100 101
Chr02D 71286248 530671781 100 101
Chr03A 69408000 602670900 100 101
Chr03C 67647329 672772988 100 101
Chr03D 69084013 741096799 100 101
Chr04A 69046420 810871661 100 101
Chr04B 63196102 880608554 100 101
Chr04C 79247115 944436626 100 101
Chr04D 67958013 1024476221 100 101
Chr05A 61429933 1093113823 100 101
Chr05B 58332174 1155158064 100 101
Chr05C 56126950 1214073568 100 101
Chr05D 63837079 1270761796 100 101
Chr06A 56548238 1335237254 100 101
Chr06B 53066733 1392350983 100 101
Chr06C 53128773 1445948392 100 101
Chr06D 48393341 1499608461 100 101
Chr07A 51432668 1548485744 100 101
Chr07B 46706416 1600432747 100 101
Chr07C 49660961 1647606236 100 101
Chr07D 47082048 1697763815 100 101
Chr08A 47769371 1745316692 100 101
Chr08B 51484660 1793563765 100 101
Chr08C 50772662 1845563280 100 101
Chr08D 48854746 1896843677 100 101
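One way to check that an index really follows the fasta order is to recompute each offset from the previous entry. The sketch below does this for the first four entries above. It assumes that each header line is just ">" plus the name with no description, and that every sequence ends with a line terminator; `next_offset` is a hypothetical helper, not part of samtools or IGV.

```python
def next_offset(next_name, length, offset, linebases, linewidth):
    """Expected byte offset of the next record's first base.

    length, offset, linebases and linewidth describe the current record.
    Assumes the next header is exactly ">" + next_name + line terminator.
    """
    newline_bytes = linewidth - linebases
    full_lines, remainder = divmod(length, linebases)
    seq_bytes = full_lines * linewidth
    if remainder:
        # Final partial line: its bases plus the line terminator.
        seq_bytes += remainder + newline_bytes
    header_bytes = len(">" + next_name) + newline_bytes
    return offset + seq_bytes + header_bytes

# First four entries of the index posted above:
# (name, length, offset, bases per line, bytes per line)
entries = [
    ("Chr01A", 78365069, 8, 100, 101),
    ("Chr01B", 73519017, 79148736, 100, 101),
    ("Chr01C", 77632021, 153402952, 100, 101),
    ("Chr01D", 78479562, 231811302, 100, 101),
]

# Each recomputed offset should equal the offset recorded for the next entry.
consistent = all(
    next_offset(nxt[0], *cur[1:]) == nxt[2]
    for cur, nxt in zip(entries, entries[1:])
)
print(consistent)  # True
```

Under those assumptions the recorded offsets match, so these records sit back to back in the fasta in the same order as the index lists them.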

jrobinso commented 9 months ago

Sorry if I've not been clear, this has changed in the latest version, as I said consider it a bug. I was referring to previous versions. At the moment you can continue to use 2.16 until the next point release, probably early March, or alternatively wait until this is fixed in the development build. I will update here when a fix is available in the development build.

kevfengler227 commented 9 months ago

I see. Thanks!

jrobinso commented 9 months ago

If you are curious about sorting you can follow the logic here. I think the leading zero in your numeric names might be throwing this off (07 vs 7), but I'm not sure. In any event I'll restore the previous behavior if loading directly from a fasta.

https://github.com/igvteam/igv/blob/master/src/main/java/org/broad/igv/feature/genome/ChromosomeComparator.java
https://github.com/igvteam/igv/blob/master/src/main/java/org/broad/igv/feature/genome/ChromosomeNameComparator.java

zongzone commented 6 months ago

2.17.3 and 2.17.4 still have this problem

jrobinso commented 6 months ago

@zongzone What problem exactly?

zongzone commented 6 months ago

> @zongzone What problem exactly?

entries in FASTA not listed in order

jrobinso commented 6 months ago

OK, I don't see a released fix for this; it's still open. This will be closed when there is a fix, hopefully soon.

jrobinso commented 3 months ago

This should be fixed with commit aeb2add5