Closed kcl58759 closed 4 months ago
Hi, I had a lot of problems with parsing and ended up making my own bed and peptide files. Just put them in the same directory where you run GENESPACE and make sure the gene names are identical in the bed and fasta files (the fasta files must have the .fa extension). You can make a bed file with:
cut -f1,3,4,8 your.gff > your.bed
You might have to change the cut command depending on the gtf/gff/gff3 structure. Also, I could not parse the bed file if it contained 'cds', 'exon', etc. entries, so I had to run:
grep -P "\tgene\t" file.bed > genes.bed
to keep only the genes. I think this is sufficient for large-scale synteny investigations.
Hope that helps.
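As an alternative to the cut/grep combination above, here is a minimal sketch that filters to gene rows and builds the bed file in one pass. It is not the method from this reply: the function name gff_genes_to_bed is hypothetical, and it assumes a tab-separated GFF3 with the feature type in column 3, start and end in columns 4 and 5, and an ID= key in the attributes column. If your annotation is laid out differently, the column numbers will need adjusting.

```shell
# Hypothetical helper: convert GFF3 gene features to a 4-column BED
# (chr, start, end, id). Assumes a standard 9-column, tab-separated GFF3
# with an "ID=" entry in the attributes column (column 9).
# Usage: gff_genes_to_bed your.gff3 > genes.bed
gff_genes_to_bed() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                        # skip header and comment lines
    $3 == "gene" {                       # keep gene rows only (no CDS, exon, ...)
      id = ""
      n = split($9, attrs, ";")          # attributes are ";"-separated key=value pairs
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^ID=/) { id = substr(attrs[i], 4) }
      }
      # GFF coordinates are 1-based; BED start coordinates are 0-based
      if (id != "") print $1, $4 - 1, $5, id
    }
  ' "$1"
}
```

The gene names written to the fourth column are the ones that need to match the headers in the peptide .fa file.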
Thanks so much!
Hi there! I am a new user, so I may be missing something obvious, but I am having trouble parsing a gff3 file. Here is my code:
parse_annotations(
  rawGenomeRepo = genomeRepo,
  genomeDirs = "E_festucae",
  genomeIDs = "E_festucae7",
  gffString = "gff3",
  faString = "faa",
  genespaceWd = wd,
  troubleShoot = TRUE,
  headerEntryIndex = 1,
  overwrite = FALSE,
  headerSep = " ",
  gffIdColumn = "ID")
Here is the parsing error:
I am unsure why it will parse at first and then is empty. Any help is much appreciated!
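One thing that may be worth checking when a parse comes back empty is whether the IDs in the gff3 actually appear in the fasta headers. The thread does not confirm that this is the cause here, so treat the following as a diagnostic sketch only. The function name gff_ids_missing_from_fasta is hypothetical, and it assumes the ID= attribute on gene rows should equal the first space-delimited token of each fasta header (mirroring headerSep = " ", headerEntryIndex = 1 and gffIdColumn = "ID" in the call above).

```shell
# Hypothetical diagnostic: print GFF3 IDs that have no matching FASTA header.
# Assumes the sequence ID is the first space-delimited token after ">".
# Usage: gff_ids_missing_from_fasta annotation.gff3 proteins.faa [feature_type]
gff_ids_missing_from_fasta() {
  gff="$1"; fa="$2"; type="${3:-gene}"   # feature type defaults to "gene"
  hdrs=$(mktemp)
  # Collect the first token of every header line, sorted and de-duplicated
  sed -n 's/^>\([^ ]*\).*/\1/p' "$fa" | sort -u > "$hdrs"
  # Collect ID= values from rows of the chosen type, then keep only the
  # IDs that are absent from the header list
  awk -F'\t' -v type="$type" '
    $3 == type {
      n = split($9, a, ";")
      for (i = 1; i <= n; i++) if (a[i] ~ /^ID=/) print substr(a[i], 4)
    }
  ' "$gff" | sort -u | comm -23 - "$hdrs"
  rm -f "$hdrs"
}
```

If every ID is printed, the headers and the gff3 IDs probably follow different naming schemes (for example, transcript or protein IDs in the fasta versus gene IDs in the gff3), and passing a different feature type as the third argument may show which rows do match.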