swarbred closed this issue 2 years ago
In call-SoftMaskGenome
Bug: the soft- and hard-masked files are not actually masked
see:
bedtools maskfasta -mc 'N' -fi /ei/cb/development/GENANNO-506/reat-dev_prediction_swarbre/cromwell-executions/ei_prediction/061c5659-a884-488f-83b4-79a7562bc598/call-SoftMaskGenome/inputs/-5733854/Calendula_officinalis_EIV1.2.fasta -bed <(gffread --bed $rep_file) -fo Calendula_officinalis_EIV1.2.hardmasked.fa
This will not work if the input GFF has match features (which looks to be the expectation, given what is being parsed in call-PreprocessRepeats).
gffread does not use all feature types, and even with -O this only works for GFF3 output.
So the match features just need to be converted to exon, which is then the same requirement as for Augustus, e.g.:
bedtools maskfasta -mc 'N' -fi Calendula_officinalis_EIV1.2.fasta -bed <(awk 'BEGIN{OFS="\t"} $3=="match" {print $1, "repmask", "exon", $4, $5, $6, $7, $8, $9}' all_interspersed_repeats.gff | gffread --bed) -fo Calendula_officinalis_EIV1.2.softmasked.fa
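The conversion step can be checked on its own before running the full masking command. The following is a minimal sketch, not the workflow's actual code: the function names `match_to_exon` and `mask_genome`, the intermediate `.repeats.bed` file and the toy GFF records are all made up for illustration. It also sets the awk input separator to a tab (the one-liner above uses the default whitespace splitting), on the assumption that the attribute column may contain spaces. Note that `-mc 'N'` replaces masked bases with N, i.e. a hard mask; `bedtools maskfasta` has a separate `-soft` flag for lower-case masking, which is what the sketch assumes the soft-masked output should use.

```shell
# Hypothetical helper: rewrite "match" features as "exon" so that gffread keeps them.
# FS is set to tab (an assumption) so column 9 survives even if it contains spaces.
match_to_exon() {
  awk 'BEGIN{FS=OFS="\t"} $3=="match" {print $1, "repmask", "exon", $4, $5, $6, $7, $8, $9}' "$1"
}

# Hypothetical helper: write soft- and hard-masked copies of a genome.
# Usage: mask_genome genome.fasta repeats.gff output_prefix
mask_genome() {
  genome=$1; repeats=$2; prefix=$3
  bed="${prefix}.repeats.bed"   # illustrative intermediate file name
  match_to_exon "$repeats" | gffread --bed > "$bed"
  # -soft lower-cases the masked bases
  bedtools maskfasta -soft -fi "$genome" -bed "$bed" -fo "${prefix}.softmasked.fa"
  # -mc N replaces the masked bases with N
  bedtools maskfasta -mc N -fi "$genome" -bed "$bed" -fo "${prefix}.hardmasked.fa"
}

# Toy input (made-up records): one match feature and one feature that should be dropped.
toy_gff=$(mktemp)
printf 'chr1\tRepeatMasker\tmatch\t11\t20\t.\t+\t.\tID=rep1\n' > "$toy_gff"
printf 'chr1\tRepeatMasker\tgene\t30\t40\t.\t+\t.\tID=g1\n' >> "$toy_gff"

# Only the match record should come out, relabelled as an exon.
match_to_exon "$toy_gff"
```

With bedtools and gffread on the PATH, `mask_genome Calendula_officinalis_EIV1.2.fasta all_interspersed_repeats.gff Calendula_officinalis_EIV1.2` would then write both masked files; counting N or lower-case bases in the outputs is a quick way to confirm the masking actually happened.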