schneebergerlab / plotsr

Tool to plot synteny and structural rearrangements between genomes
MIT License

Visualizing the comparison between two haplotypes and --notr or --nodup didn't work #5

Closed Yutang-ETH closed 2 years ago

Yutang-ETH commented 2 years ago

Hi Manish,

chr1_syri_new.pdf

I visualized the SyRI output with plotsr and found many translocations and duplications between my two haplotypes. Does this make sense? The PAV track in my plot actually shows the NOTAL regions (regions that did not align between the two homologous chromosomes; Nucmer4 was used for the alignment). More than 100 Mb are not aligned.

I did not want to show translocations or duplications, so I added --notr and --nodup, but that did not work and I got the following error:

Traceback (most recent call last):
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 10

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yutachen/anaconda3/envs/python/bin/plotsr", line 6, in <module>
    main(sys.argv[1:])
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/plotsr/main.py", line 44, in main
    plotsr(args)
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/plotsr/plotsr.py", line 134, in plotsr
    alignments[i][1] = filterinput(args, alignments[i][1], chrids[i][1])
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/plotsr/func.py", line 688, in filterinput
    df = df.loc[~df[10].isin(['TRANS', 'INVTR'])]
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/yutachen/anaconda3/envs/python/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 10

My command was:

plotsr --sr chr1_syri.out --genomes chr1_genomes.txt --tracks chr1_tracks.txt --cfg chr1.cfg -H 3 -W 10 -o chr1_syri_new.pdf -S 0.2 --notr --nodup

Did I use the command correctly? Thank you very much.

Best wishes, Yutang

mnshgl0110 commented 2 years ago

This bug should be fixed in commit 89551710117fbd48729918519fe140b80e85a5eb. Please try the updated version.

Regarding translocations and duplications (TDs): yes, that is possible if the genome contains many transposable elements (TEs) or if the genomes are highly diverged. However, I would suggest trying nucmer3, as the alignments from nucmer4 are sometimes incorrect.
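
For anyone who hits the same traceback: pandas raises `KeyError: 10` when `df[10]` is evaluated on a frame whose column labels do not include the integer `10`, for example because the columns carry names instead of default integer labels. The sketch below reproduces that failure mode on a toy frame and shows a position-based filter that does not depend on the labels. The column names, the toy rows, and the helper `drop_translocations` are hypothetical illustrations. This is not the code from the fix commit, and it is only a guess at the kind of mismatch involved.

```python
import pandas as pd

# Toy frame with 11 *named* columns; the last one plays the role of the
# annotation-type column. All names and rows here are made up.
cols = [f"c{i}" for i in range(10)] + ["type"]
rows = [
    ["Chr1", 1, 100, "-", "-", "Chr1", 1, 100, "SYN1", "-", "SYN"],
    ["Chr1", 101, 200, "-", "-", "Chr1", 501, 600, "TRANS1", "-", "TRANS"],
    ["Chr1", 201, 300, "-", "-", "Chr1", 201, 300, "INV1", "-", "INV"],
    ["Chr1", 301, 400, "-", "-", "Chr1", 701, 800, "INVTR1", "-", "INVTR"],
]
df = pd.DataFrame(rows, columns=cols)


def label_lookup_fails(frame):
    """Return True if frame[10] raises KeyError (no column labelled 10)."""
    try:
        frame[10]
    except KeyError:
        return True
    return False


def drop_translocations(frame, type_pos=10):
    """Drop rows whose column at *position* type_pos is TRANS or INVTR."""
    mask = frame.iloc[:, type_pos].isin(["TRANS", "INVTR"])
    return frame.loc[~mask]


filtered = drop_translocations(df)
print(label_lookup_fails(df))
print(filtered["type"].tolist())
```

Selecting by position with `iloc` works whether the frame was read with `header=None` (integer labels) or later given column names, which is why it is used in this sketch.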

Yutang-ETH commented 2 years ago

Hi Manish,

Thank you very much for your answer. I see. I installed my current plotsr via conda, so I will try the updated version and report back.

Regarding our genome, it is indeed very divergent! Its heterozygosity is 3.8%, as estimated by genomescope2. I am actually curious about how divergent the two haplotypes are. I could try nucmer3, but I am not sure how much better the result would be.

Best wishes, Yutang
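
One rough way to put a number on "how divergent" is to sum the reference span covered by each top-level annotation type in the SyRI output. The sketch below assumes the tab-separated syri.out layout in which columns 1 and 2 (0-based) hold the reference start and end, column 10 holds the annotation type, and rows without reference coordinates use "-". The toy records and the helper `ref_span_by_type` are made up for illustration, so please check the layout against your own file before relying on it.

```python
import io

import pandas as pd

# Made-up records in the assumed syri.out layout (12 tab-separated columns).
toy = "\n".join([
    "Chr1\t1\t1000\t-\t-\tChr1\t1\t1000\tSYN1\t-\tSYN\t-",
    "Chr1\t1001\t1500\t-\t-\t-\t-\t-\tNOTAL1\t-\tNOTAL\t-",
    "Chr1\t1501\t2000\t-\t-\tChr1\t3001\t3500\tTRANS1\t-\tTRANS\t-",
    "-\t-\t-\t-\t-\tChr1\t1001\t3000\tNOTAL2\t-\tNOTAL\t-",
])


def ref_span_by_type(handle):
    """Sum reference-side lengths (end - start + 1) per annotation type."""
    df = pd.read_csv(handle, sep="\t", header=None, dtype=str)
    # Rows with "-" in the reference coordinates become NaN and are skipped.
    start = pd.to_numeric(df[1], errors="coerce")
    end = pd.to_numeric(df[2], errors="coerce")
    df["span"] = end - start + 1
    kept = df.dropna(subset=["span"])
    return kept.groupby(10)["span"].sum().astype(int).to_dict()


spans = ref_span_by_type(io.StringIO(toy))
print(spans)
```

On a real file you would pass the path, for example chr1_syri.out, instead of the `StringIO` handle, and probably restrict the rows to the top-level types (SYN, INV, TRANS, INVTR, DUP, INVDP, NOTAL) so that nested alignment and variant records are not counted as well.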

Yutang-ETH commented 2 years ago

chr1_syri_new.pdf

It worked after I installed the package downloaded from GitHub.

Thank you very much for your help, and for this nice tool.

Best wishes, Yutang