Closed mariamadrid19 closed 4 months ago
Hi @mariamadrid19,
Thanks for reaching out! I have seen that before - I thought I had fixed the issue, but looks like there could still be some considerations I'm missing for some inputs..
A couple follow-up questions to us troubleshoot:
{prefix}.links.tsv
file that is generated prior to running the R script? Usually, the issue is a sorting problem, but that would help me confirm thatThank you for your interest in ntSynt! Lauren
Hi Lauren, Thanks for quick response!
block_id seq_id bin_id start end seq_id2 bin_id2 start2 end2 strand block_ori colour_block 0 scaffold_1 dh1_50_scaffolds.fasta 127771 10241253 scaffold_1_RagTag sorted_dh2.fasta 353864 10513305 + + scaffold_1 0 scaffold_1_RagTag sorted_dh2.fasta 353864 10513305 scaffold_1_RagTag sorted_nh1.fasta 391161 10456794 + + scaffold_1 0 scaffold_1_RagTag sorted_nh1.fasta 391161 10456794 scaffold_1_RagTag sorted_nh2.fasta 481757 10635563 + + scaffold_1 1 scaffold_1 dh1_50_scaffolds.fasta 12669193 14497695 scaffold_1_RagTag sorted_dh2.fasta 10556355 12368144 + + scaffold_1 1 scaffold_1_RagTag sorted_dh2.fasta 10556355 12368144 scaffold_1_RagTag sorted_nh1.fasta 12943737 14741589 + + scaffold_1 1 scaffold_1_RagTag sorted_nh1.fasta 12943737 14741589 scaffold_1_RagTag sorted_nh2.fasta 13047019 14852858 + + scaffold_1 2 scaffold_1 dh1_50_scaffolds.fasta 16898655 17501766 scaffold_1_RagTag sorted_dh2.fasta 12408555 12988155 + + scaffold_1 2 scaffold_1_RagTag sorted_dh2.fasta 12408555 12988155 scaffold_1_RagTag sorted_nh1.fasta 17190019 17784012 + + scaffold_1 2 scaffold_1_RagTag sorted_nh1.fasta 17190019 17784012 scaffold_1_RagTag sorted_nh2.fasta 15820487 16417467 + + scaffold_1 3 scaffold_1 dh1_50_scaffolds.fasta 17523763 20283963 scaffold_1_RagTag sorted_dh2.fasta 13004164 15889460 + + scaffold_1 3 scaffold_1_RagTag sorted_dh2.fasta 13004164 15889460 scaffold_1_RagTag sorted_nh1.fasta 17797545 20663198 + + scaffold_1 3 scaffold_1_RagTag sorted_nh1.fasta 17797545 20663198 scaffold_1_RagTag sorted_nh2.fasta 17978199 20924186 + + scaffold_1 4 scaffold_1 dh1_50_scaffolds.fasta 20352919 20773473 scaffold_1_RagTag sorted_dh2.fasta 15922344 16347458 - - scaffold_1 4 scaffold_1_RagTag sorted_dh2.fasta 15922344 16347458 scaffold_1_RagTag sorted_nh1.fasta 20690755 21114805 + - scaffold_1 4 scaffold_1_RagTag sorted_nh1.fasta 20690755 21114805 scaffold_1_RagTag sorted_nh2.fasta 20975701 21402802 + - scaffold_1 5 scaffold_1 dh1_50_scaffolds.fasta 20807163 22392208 scaffold_1_RagTag sorted_dh2.fasta 16415892 18070887 + + scaffold_1 5 scaffold_1_RagTag sorted_dh2.fasta 16415892 18070887 scaffold_1_RagTag sorted_nh1.fasta 21128295 22747929 + + scaffold_1 5 scaffold_1_RagTag sorted_nh1.fasta 21128295 22747929 scaffold_1_RagTag sorted_nh2.fasta 21459278 23158762 + + scaffold_1 6 scaffold_10 dh1_50_scaffolds.fasta 1420700 4717528 scaffold_10_RagTag sorted_dh2.fasta 3018 3310683 + + scaffold_10 6 scaffold_10_RagTag sorted_dh2.fasta 3018 3310683 scaffold_10_RagTag sorted_nh1.fasta 3177135 6523089 + + scaffold_10 6 scaffold_10_RagTag sorted_nh1.fasta 3177135 6523089 scaffold_10_RagTag sorted_nh2.fasta 3261020 6637663 + + scaffold_10 7 scaffold_10 dh1_50_scaffolds.fasta 5087916 6120902 scaffold_10_RagTag sorted_dh2.fasta 3327575 4338644 + + scaffold_10 7 scaffold_10_RagTag sorted_dh2.fasta 3327575 4338644 scaffold_10_RagTag sorted_nh1.fasta 6877692 7861286 + + scaffold_10 7 scaffold_10_RagTag sorted_nh1.fasta 6877692 7861286 scaffold_10_RagTag sorted_nh2.fasta 7009219 8013787 + + scaffold_10 8 scaffold_10 dh1_50_scaffolds.fasta 6382001 6754520 scaffold_10_RagTag sorted_dh2.fasta 4578058 4942654 + + scaffold_10 8 scaffold_10_RagTag sorted_dh2.fasta 4578058 4942654 scaffold_10_RagTag sorted_nh1.fasta 7861773 8276295 + + scaffold_10 8 scaffold_10_RagTag sorted_nh1.fasta 7861773 8276295 scaffold_10_RagTag sorted_nh2.fasta 8273514 8667302 + + scaffold_10 9 scaffold_10 dh1_50_scaffolds.fasta 7091432 9468155 scaffold_10_RagTag sorted_dh2.fasta 4952344 7311704 + + scaffold_10 9 scaffold_10_RagTag sorted_dh2.fasta 4952344 7311704 scaffold_10_RagTag sorted_nh1.fasta 8696863 11108309 + + scaffold_10 9 scaffold_10_RagTag sorted_nh1.fasta 8696863 11108309 scaffold_10_RagTag sorted_nh2.fasta 9092319 11466626 + + scaffold_10 10 scaffold_10 dh1_50_scaffolds.fasta 9567861 10314791 scaffold_10_RagTag sorted_dh2.fasta 7538746 8303917 + + scaffold_10 10 scaffold_10_RagTag sorted_dh2.fasta 7538746 8303917 scaffold_10_RagTag sorted_nh1.fasta 11203238 11904562 + + scaffold_10 10 scaffold_10_RagTag sorted_nh1.fasta 11203238 11904562 scaffold_10_RagTag sorted_nh2.fasta 11640620 12409494 + + scaffold_10 11 scaffold_10 dh1_50_scaffolds.fasta 10388709 10759026 scaffold_14_RagTag sorted_dh2.fasta 1921821 2294353 + + scaffold_10 11 scaffold_14_RagTag sorted_dh2.fasta 1921821 2294353 scaffold_10_RagTag sorted_nh1.fasta 11919620 12267090 + + scaffold_10 11 scaffold_10_RagTag sorted_nh1.fasta 11919620 12267090 scaffold_10_RagTag sorted_nh2.fasta 12468715 12830402 + + scaffold_10 12 scaffold_11 dh1_50_scaffolds.fasta 1626 936317 scaffold_11_RagTag sorted_dh2.fasta 9082717 10035245 + + scaffold_11 12 scaffold_11_RagTag sorted_dh2.fasta 9082717 10035245 scaffold_11_RagTag sorted_nh1.fasta 9149473 10077785 + + scaffold_11 12 scaffold_11_RagTag sorted_nh1.fasta 9149473 10077785 scaffold_23_RagTag sorted_nh2.fasta 8668325 9609365 + + scaffold_11 13 scaffold_11 dh1_50_scaffolds.fasta 1117585 3227950 scaffold_11_RagTag sorted_dh2.fasta 10227841 12337977 + + scaffold_11 13 scaffold_11_RagTag sorted_dh2.fasta 10227841 12337977 scaffold_11_RagTag sorted_nh1.fasta 10242511 12337086 + + scaffold_11 13 scaffold_11_RagTag sorted_nh1.fasta 10242511 12337086 scaffold_23_RagTag sorted_nh2.fasta 9642751 11758037 + + scaffold_11 14 scaffold_11 dh1_50_scaffolds.fasta 3365684 3610293 scaffold_11_RagTag sorted_dh2.fasta 12579399 12825768 + + scaffold_11 14 scaffold_11_RagTag sorted_dh2.fasta 12579399 12825768 scaffold_11_RagTag sorted_nh1.fasta 12419983 12659858 + + scaffold_11 14 scaffold_11_RagTag sorted_nh1.fasta 12419983 12659858 scaffold_11_RagTag sorted_nh2.fasta 146629 421083 + + scaffold_11 15 scaffold_11 dh1_50_scaffolds.fasta 3827112 3978375 scaffold_11_RagTag sorted_dh2.fasta 12911321 13062925 + + scaffold_11 15 scaffold_11_RagTag sorted_dh2.fasta 12911321 13062925 scaffold_11_RagTag sorted_nh1.fasta 12943806 13157704 + + scaffold_11 15 scaffold_11_RagTag sorted_nh1.fasta 12943806 13157704 scaffold_11_RagTag sorted_nh2.fasta 573870 725715 + + scaffold_11 16 scaffold_11 dh1_50_scaffolds.fasta 3978848 4046659 scaffold_11_RagTag sorted_dh2.fasta 13063398 13131145 + - scaffold_11 16 scaffold_11_RagTag sorted_dh2.fasta 13063398 13131145 scaffold_11_RagTag sorted_nh1.fasta 14136737 14207142 - - scaffold_11 16 scaffold_11_RagTag sorted_nh1.fasta 14136737 14207142 scaffold_11_RagTag sorted_nh2.fasta 726188 799664 - - scaffold_11 17 scaffold_11 dh1_50_scaffolds.fasta 4082815 4263724 scaffold_11_RagTag sorted_dh2.fasta 13167332 13353628 + + scaffold_11 17 scaffold_11_RagTag sorted_dh2.fasta 13167332 13353628 scaffold_11_RagTag sorted_nh1.fasta 14244453 14434749 + + scaffold_11 17 scaffold_11_RagTag sorted_nh1.fasta 14244453 14434749 scaffold_11_RagTag sorted_nh2.fasta 835842 1027534 + + scaffold_11 18 scaffold_11 dh1_50_scaffolds.fasta 4958836 5018132 scaffold_11_RagTag sorted_dh2.fasta 14045546 14104805 + - scaffold_11 18 scaffold_11_RagTag sorted_dh2.fasta 14045546 14104805 scaffold_11_RagTag sorted_nh1.fasta 15344045 15415395 - - scaffold_11 18 scaffold_11_RagTag sorted_nh1.fasta 15344045 15415395 scaffold_11_RagTag sorted_nh2.fasta 1920062 1991302 + - scaffold_11 19 scaffold_11 dh1_50_scaffolds.fasta 5262159 5754322 scaffold_11_RagTag sorted_dh2.fasta 14344236 14836685 + + scaffold_11 19 scaffold_11_RagTag sorted_dh2.fasta 14344236 14836685 scaffold_11_RagTag sorted_nh1.fasta 14439090 14944729 + + scaffold_11 19 scaffold_11_RagTag sorted_nh1.fasta 14439090 14944729 scaffold_11_RagTag sorted_nh2.fasta 1031941 1525539 + + scaffold_11 20 scaffold_11 dh1_50_scaffolds.fasta 5755719 5818193 scaffold_11_RagTag sorted_dh2.fasta 14838082 14894928 + + scaffold_11 20 scaffold_11_RagTag sorted_dh2.fasta 14838082 14894928 scaffold_11_RagTag sorted_nh1.fasta 13513757 13577354 + + scaffold_11 20 scaffold_11_RagTag sorted_nh1.fasta 13513757 13577354 scaffold_11_RagTag sorted_nh2.fasta 1526952 1586104 + + scaffold_11 21 scaffold_11 dh1_50_scaffolds.fasta 5828277 7982186 scaffold_11_RagTag sorted_dh2.fasta 14905011 17070329 + + scaffold_11 21 scaffold_11_RagTag sorted_dh2.fasta 14905011 17070329 scaffold_11_RagTag sorted_nh1.fasta 15004490 17251407 + + scaffold_11 21 scaffold_11_RagTag sorted_nh1.fasta 15004490 17251407 scaffold_11_RagTag sorted_nh2.fasta 1594698 3827707 + + scaffold_11 22 scaffold_11 dh1_50_scaffolds.fasta 8878988 8893702 scaffold_11_RagTag sorted_dh2.fasta 17849789 17864461 - - scaffold_11 22 scaffold_11_RagTag sorted_dh2.fasta 17849789 17864461 scaffold_19_RagTag sorted_nh1.fasta 9059938 9074616 + - scaffold_11 22 scaffold_19_RagTag sorted_nh1.fasta 9059938 9074616 scaffold_34 sorted_nh2.fasta 216694 237293 + - scaffold_11 23 scaffold_11 dh1_50_scaffolds.fasta 8939931 8970529 scaffold_11_RagTag sorted_dh2.fasta 17772928 17803541 - - scaffold_11
I tried it with only 3, instead of 4, assemblies, and that one worked perfectly fine. I can't figure out what the problem is.
Could it be that the problem is with my assembly (nh2 isn't mapping correctly?)?
Thanks for that!
Ok, yes it looks like a sorting issue - So for the first synteny block/link from your file I see this:
0 scaffold_1 dh1_50_scaffolds.fasta 127771 10241253 scaffold_1_RagTag sorted_dh2.fasta 353864 10513305 + + scaffold_1
0 scaffold_1_RagTag sorted_dh2.fasta 353864 10513305 scaffold_1_RagTag sorted_nh1.fasta 391161 10456794 + + scaffold_1
0 scaffold_1_RagTag sorted_nh1.fasta 391161 10456794 scaffold_1_RagTag sorted_nh2.fasta 481757 10635563 + + scaffold_1
So, it is formatted in such a way that the assemblies are expected to be ordered: dh1_50_scaffolds.fasta, sorted_dh2.fasta, sorted_nh1.fasta, sorted_nh2.fasta
But, it looks like the gtools::mixedorder
that I'm using to sort the input assemblies is finding that slightly different order (ie sorted_nh2.fasta before sorted_nh1.fasta)?
One thing that could be an easy thing to try as a workaround (looks like the genomes are fairly small, so ntSynt should have run fairly quickly?) - could you try renaming one of sorted_nh1.fasta
or sorted_nh2.fasta
to something a bit more different alphabetically? Like sorted_xh2.fa
or something?
I think that your assemblies and the ntSynt outputs are fine - I think it's probably just something wrong in the parsing of the links within the R script.
Hey, it was in fact just an issue order. I re-ran it with a new name and now it looks fine. Thanks a lot!
Excellent! I'm glad that solved it! I'll see if there's anything easy on my end that I could tweak to fix that case, otherwise I'll make a note of that potential issue on the README 👍
Hello! I am trying to compare 4 assemblies. Everything is working great except when I try to plot the links between them, the figure I get is missing the links in the middle. I don't understand why this is happening and would appreciate the feedback. This is the plot it's generating:
And this is the R script I used. I had to load dplyr because the "arrange" function wasn't working and remove the "no_x_axis" function because R couldn't find it.