loukesio closed this issue 2 years ago
thickStart and thickEnd are usually used to define the coding part (CDS), which does not exist for non-coding genes. What do you think would be a correct value in this case?
Dear Juke,
Thank you so much for the prompt reply. I highly appreciate it. In the following picture I would fill thickStart and thickEnd with chromStart and chromEnd? What do you think?
Then an awk command should do the trick, e.g.:
awk 'BEGIN{FS=OFS="\t"}{if($7 == "."){$7=$2; $8=$3} print $0 }' file.bed
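To make the effect concrete, here is a minimal sketch that runs the same kind of awk filter on a tiny sample file. The two BED12 records, the file names and the `fill_thick` helper are hypothetical and only serve as illustration. The sketch assumes the BED file is tab-delimited and that missing thickStart/thickEnd values are written as ".".

```shell
# Hypothetical two-line BED12 sample (tab-separated).
# Record 1: non-coding transcript with "." in thickStart/thickEnd (columns 7 and 8).
# Record 2: coding transcript with real CDS coordinates, which must stay unchanged.
printf 'chr1\t100\t500\tncRNA1\t0\t+\t.\t.\t0\t1\t400\t0\n' > sample.bed
printf 'chr1\t1000\t2000\tmRNA1\t0\t-\t1100\t1900\t0\t1\t1000\t0\n' >> sample.bed

# Hypothetical helper: where thickStart is ".", copy chromStart (column 2)
# into thickStart and chromEnd (column 3) into thickEnd; print every line.
fill_thick() {
  awk 'BEGIN{FS=OFS="\t"} $7 == "." {$7=$2; $8=$3} {print}' "$1"
}

fill_thick sample.bed > sample.filled.bed

# Show columns 1-8 of the result for a quick visual check.
cut -f1-8 sample.filled.bed
```

After this runs, the ncRNA1 record should carry 100 and 500 in columns 7 and 8, while the mRNA1 record keeps 1100 and 1900. Note that the UCSC BED description says thickStart and thickEnd are usually both set to chromStart when there is no thick part, so replacing `$8=$3` with `$8=$2` would be an alternative if you prefer that convention.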
I am using AGAT at the moment to convert a gff file to bed using the following command. When I convert the gff to bed I find NA values (aka. ".") in the thickStart and thickEnd columns of the bed file for the non-coding RNAs. Is there a way to convert gff to bed and obtain thickStart and thickEnd values for these elements? Thank you for your time.
The gff3 file that I am working on is available at this link: https://drive.google.com/drive/folders/1-wmbc9gKtbXFJ95E0n41WgPL-G313SNe?usp=sharing