Open emmafg opened 3 months ago
I recommend running BRAKER on the complete genome. You can extract the scaffold-specific predictions afterwards. For example, to get the predictions for scaffold_12:
grep -P '^scaffold_12\t' braker.gtf | grep -P '\ttranscript\t' | cut -f 9 > scaff12_tx.lst # get all the transcript names on the scaffold
cdbfasta braker.codingseq -o braker.codingseq.idx # index the codingseq file
cat scaff12_tx.lst | cdbyank braker.codingseq.idx -d braker.codingseq > scaff12_tx.codingseq # extract the transcripts from codingseq file
Typos are possible in these commands; I am drafting them without testing.
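Since the question asks for a gene count rather than the sequences, here is a minimal, untested sketch that counts features per scaffold directly from the GTF. It assumes braker.gtf is tab-separated with the scaffold name in column 1 and the feature type in column 3, and that it contains "gene" lines; the function name count_genes_on_scaffold is made up for illustration and is not part of BRAKER.

```shell
# Hypothetical helper (not part of BRAKER): count "gene" features on one scaffold.
# Usage: count_genes_on_scaffold <gtf_file> <scaffold_name>
count_genes_on_scaffold() {
    # Split on tabs; column 1 is the sequence name, column 3 the feature type.
    # "n + 0" makes awk print 0 instead of an empty line when nothing matches.
    awk -F'\t' -v scaf="$2" \
        '$1 == scaf && $3 == "gene" { n++ } END { print n + 0 }' "$1"
}
```

It would be called as count_genes_on_scaffold braker.gtf scaffold_12. Note that a count of ">" lines in a codingseq file is, as far as I know, a count of transcripts, so it can be higher than the gene count when genes have alternative isoforms.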
On Tue, Apr 2, 2024 at 12:04 PM emmafg @.***> wrote:
Hello, I am new to gene prediction. As part of my research I have to predict the number of genes on some of my scaffolds. One of my scaffolds (scaffold_10) is large enough for BRAKER3 to run on it, and the output is very good: from braker.codingseq I obtain my predicted number of genes using grep -c ">" braker.codingseq. However, another, much smaller scaffold (scaffold_12) had to be combined with scaffold_10 so that the input was large enough for BRAKER to run. In the output of that combined run, I don't know how to determine the number of genes predicted on scaffold_12 only. Is there a way to recover the predicted genes per scaffold?
Thank you for your help! Emma