oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

Exclude bed format region not working! #300

Open bijendrabio opened 2 years ago

bijendrabio commented 2 years ago

Hello, I ran EDTA with the --exclude option (together with --anno 1) to exclude the regions listed in a BED file from the TE annotation. However, the output file still contains TEs predicted within those regions. Why might this be the case? Kindly advise!

Regards, B

bijendrabio commented 1 year ago

@oushujun

oushujun commented 1 year ago

Hello @bijendrabio,

I am so sorry for the delay in getting back to this issue. The --exclude parameter is designed to keep user-specified regions from being masked in the MAKER.masked output; it does not remove TE annotations from those regions. This is useful if you already have gene annotations or a list of regions that you don't want masked in the low-threshold masking output (if you don't know what this means, please check out the wiki). Sorry if this was not clear in the user manual.

On the other hand, if you want to remove TE annotations that overlap your list of target regions, you can simply use bedtools intersect with -v, which reports only the entries in -a that have no overlap with -b. For example:

bedtools intersect -v -a genome.fa.mod.EDTA.TEanno.gff3 -b genome.exclude.bed > genome.fa.mod.EDTA.TEanno.clean.gff3
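If bedtools is not at hand, or you want to see what the overlap filter is doing, the same idea can be sketched in plain Python. This is only an illustrative approximation of bedtools intersect -v, not how bedtools or EDTA implement it: the function names parse_bed and filter_gff3 and the toy records are made up for this example, and unlike bedtools it passes comment lines through unchanged. It assumes tab-separated input and accounts for BED being 0-based half-open while GFF3 is 1-based inclusive.

```python
from collections import defaultdict


def parse_bed(lines):
    """Collect BED intervals (0-based, half-open) per sequence name."""
    regions = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        regions[chrom].append((int(start), int(end)))
    return regions


def filter_gff3(gff_lines, regions):
    """Keep GFF3 features that overlap none of the BED regions."""
    kept = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            kept.append(line)  # pass header/comment lines through
            continue
        fields = line.rstrip("\n").split("\t")
        chrom = fields[0]
        # GFF3 is 1-based inclusive; convert to 0-based half-open.
        start, end = int(fields[3]) - 1, int(fields[4])
        overlaps = any(
            start < r_end and r_start < end
            for r_start, r_end in regions.get(chrom, [])
        )
        if not overlaps:
            kept.append(line)
    return kept


# Toy data (hypothetical records, for illustration only).
bed = ["Chr1\t100\t200\n"]
gff = [
    "##gff-version 3\n",
    "Chr1\tEDTA\tGypsy_LTR_retrotransposon\t150\t300\t.\t+\t.\tID=TE_1\n",
    "Chr1\tEDTA\thelitron\t201\t400\t.\t-\t.\tID=TE_2\n",
    "Chr2\tEDTA\tCACTA_TIR_transposon\t120\t180\t.\t+\t.\tID=TE_3\n",
]
clean = filter_gff3(gff, parse_bed(bed))
print("".join(clean), end="")
```

Here TE_1 is dropped because it overlaps the excluded region, TE_2 is kept because it starts one base after the region ends, and TE_3 is kept because no region is listed for Chr2. For real genomes bedtools will be much faster, since this sketch compares every feature against every region on the same sequence.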

Please let me know if you have other questions.

Thanks, Shujun