hillerlab / GenomeAlignmentTools

Tools for improving the sensitivity and specificity of genome alignments
MIT License

NetFilterNonNested error #11

Closed rapamycin closed 2 years ago

rapamycin commented 2 years ago

Dear Michael,

I followed your pipeline published in GigaScience (2020, 120-way mammal alignment) and aligned a new rodent genome to hg38.

After netClass, my net file looks like the following:

net chr1 248956422
 fill 11502 17549 Scaffold124 + 86810 13269 id 3869 score 204610 ali 8054 tN 0 qN 18 tR 3301 qR 2443 tTrf 173 qTrf 148
  gap 12112 64 Scaffold124 + 87448 992 tN 0 qN 0 tR 0 qR 63 tTrf 0 qTrf 0
   fill 12112 52 Scaffold124 - 48726 55 id 59165 score 85 ali 52 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 12262 26 Scaffold124 + 88510 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 12733 486 Scaffold124 + 88954 851 tN 0 qN 0 tR 0 qR 479 tTrf 0 qTrf 0
   fill 12734 485 Scaffold419 + 65479 905 id 6168 score 4813 ali 452 tN 0 qN 0 tR 0 qR 368 tTrf 0 qTrf 0
  gap 13803 25 Scaffold124 + 90351 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
   fill 13803 25 Scaffold704 - 144389 25 id 341141 score 231 ali 25 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 14933 29 Scaffold124 + 91633 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
......

Then I ran NetFilterNonNested.perl with the following parameters:

NetFilterNonNested.perl -doUCSCSynFilter -keepSynNetsWithScore 5000 -keepInvNetsWithScore 5000 human_ssp_all_classed.net >human_ssp_all_classed_filtered.net

I got the following error:

ERROR: parameter -keepSynNetsWithScore/-keepInvNetsWithScore-doUCSCSynFilter is given, but I cannot parse the net type from this fill line:
fill 11502 17549 Scaffold124 + 86810 13269 id 3869 score 204610 ali 8054 tN 0 qN 18 tR 3301 qR 2443 tTrf 173 qTrf 148

Could you please help me to figure out the possible reasons?

Best Wishes,

Tao

MichaelHiller commented 2 years ago

Dear Tao, I assume something went wrong with netClass. After netClass, your net lines should end with something like "type top" or "type inv" etc.

NetFilterNonNested critically needs this information and the error message indicates that your nets don't provide that.
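One way to confirm this before running the filter is to scan the net file for fill lines that carry no `type` field. The following is a minimal Python sketch, not part of GenomeAlignmentTools: the helper name `fills_missing_type` is hypothetical, and it assumes that everything after the six fixed columns of a fill line is a sequence of key/value pairs, as in the lines quoted above. The second fill line in the example is constructed for illustration by appending "type top" to the first.

```python
def fills_missing_type(lines):
    """Return the fill lines that carry no 'type <value>' field."""
    missing = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "fill":
            continue  # skip "net", "gap" and blank lines
        # fields[1:7] are tStart tSize qName strand qStart qSize;
        # the remainder is assumed to be key/value pairs (id 3869 score ...).
        keys = fields[7::2]
        if "type" not in keys:
            missing.append(line.strip())
    return missing


example = [
    "net chr1 248956422",
    " fill 11502 17549 Scaffold124 + 86810 13269 id 3869 score 204610 ali 8054",
    # Constructed for illustration: same line with a type field appended.
    " fill 11502 17549 Scaffold124 + 86810 13269 id 3869 score 204610 ali 8054 type top",
]

for bad in fills_missing_type(example):
    print("no net type:", bad)
```

To check a real file, pass an open file handle instead of the list, for example `fills_missing_type(open("human_ssp_all_classed.net"))`. Any line it reports would likely trigger the same parse error in NetFilterNonNested.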

Maybe try to run netClass again. If that still doesn't classify the net type, then maybe contact UCSC.

Best Michael

rapamycin commented 2 years ago

We solved this problem by running netSyntenic before netClass.
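The order of the two steps can be sketched in Python as below. This is only an illustration of the sequence described here, not the exact commands that were run: the file names and the query database name `queryDb` are placeholders, the `netClass` argument order is taken from its usage message (`in.net tDb qDb out.net`), and `netSyntenic` is assumed to take an input and an output net file. The sketch builds and prints the commands without executing them.

```python
import shlex


def classification_commands(in_net, syn_net, classed_net, t_db, q_db):
    """Build the two commands in the order reported above:
    netSyntenic first, then netClass on its output."""
    return [
        ["netSyntenic", in_net, syn_net],
        ["netClass", syn_net, t_db, q_db, classed_net],
    ]


# Placeholder names; hg38 is the reference used in this thread,
# "queryDb" stands in for the query assembly database.
cmds = classification_commands(
    "in.net", "in.syn.net", "in.syn.classed.net", "hg38", "queryDb"
)
for cmd in cmds:
    print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`, provided both tools are on the PATH and the repeat masker tables for both databases are available.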

Best wishes!

aaannaw commented 2 years ago

Hello @rapamycin, could you share the netClass command with me? After running netSyntenic, I generated target_genome.net and ref_genome.net, but it is not clear to me what the input of netClass should be. Could you give me any suggestions?

MichaelHiller commented 2 years ago

netClass
netClass - Add classification info to net
usage:
   netClass [options] in.net tDb qDb out.net
       tDb - database to fetch target repeat masker table information
       qDb - database to fetch query repeat masker table information
options:
   -tNewR=dir - Dir of chrN.out.spec files, with RepeatMasker .out format lines describing lineage specific repeats in target
   -qNewR=dir - Dir of chrN.out.spec files for query
   -noAr - Don't look for ancient repeats
   -qRepeats=table - table name for query repeats in place of rmsk
   -tRepeats=table - table name for target repeats in place of rmsk

tDb and qDb refer to the reference and query assembly databases, which must contain the repeat masker SQL table. As far as I know, there is no database-free version, although it would be worth asking the UCSC team whether one exists. netClass is part of the kent src code.