Closed alexvasilikop closed 2 years ago
Hi, @alexvasilikop
Thank you very much for the feedback.
There is a note in the vignette pointing to the functions Biostrings::readAAStringSet() (to read FASTA files) and rtracklayer::import() (to read GFF/GTF files).
However, if you haven't noticed it, it probably should not be a simple note.
I will write a whole section explaining how to import FASTA and GFF/GTF files into the R session. Give me a few minutes.
Best, Fabricio
Hi, @alexvasilikop
I wrote a section on how to load FASTA files as a list of AAStringSet objects and GFF/GTF files as a GRangesList.
Check out the new documentation website: https://almeidasilvaf.github.io/syntenet/
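For anyone landing on this thread, here is a minimal sketch of that kind of import, built only on the two functions named above. The helper name load_input() and the one-file-per-species directory layout are assumptions made for illustration, not part of syntenet; the documentation linked above remains the authoritative reference.

```r
# Sketch only: load_input() is a hypothetical helper, not a syntenet function.
# Assumes one FASTA and one GFF/GTF file per species, with matching base names
# (e.g. proteomes/speciesA.fa and annotation/speciesA.gff3).
library(Biostrings)   # readAAStringSet()
library(rtracklayer)  # import()

load_input <- function(fasta_dir, gff_dir) {
    fasta_files <- list.files(fasta_dir, full.names = TRUE)
    gff_files <- list.files(gff_dir, full.names = TRUE)

    # One AAStringSet per species, named after the file (extensions dropped,
    # so species names containing a dot would be truncated)
    seqs <- lapply(fasta_files, readAAStringSet)
    names(seqs) <- sub("\\..*$", "", basename(fasta_files))

    # One GRanges per species; import() infers the format from the extension
    annot <- lapply(gff_files, import)
    names(annot) <- sub("\\..*$", "", basename(gff_files))

    list(
        seq = seqs,
        annotation = GenomicRanges::GRangesList(annot, compress = FALSE)
    )
}

# Example call (directory names are placeholders):
# input <- load_input("proteomes", "annotation")
# syntenet::check_input(input$seq, input$annotation)
```

The two lists are named identically on purpose, so that each species' sequences can be paired with its annotation.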
Best, Fabricio
Hi, sorry to bother you on a closed issue. I'm really eager to run a microsynteny-based phylogeny, and I was wondering if you could guide me on how to import the FASTA and GFF files for the analysis. I've been having issues: I tried the .pep and .bed formats with no success. I see there's a way using Biostrings::readAAStringSet() (to read FASTA files) and rtracklayer::import(), but I'm a little lost, as I don't have much expertise.
Hi, @jhcuarta
After this issue was opened, I updated the documentation with an entire section on how to load data from FASTA and GFF files (see here).
Besides, note that:
… speciesA.fa, or even speciesA.pep.fa)
Best, Fabricio
Hi, I was wondering if you could help me out, since my data didn't pass check_input. I'm confused, because both files were obtained using Prokka 1.14.6. The protein names and the headers have to match, right? Here are two links to my data files so you can give me a hand:
https://drive.google.com/file/d/1RT_IKKFsnGGTS0E_SBkdPeWu1E3GVtVq/view?usp=sharing
https://drive.google.com/file/d/11zkv1m2fEZA7BjQalRlWnAqdWEyaeC2-/view?usp=sharing
Best regards, and thanks in advance
Hi, I re-edited the FASTA sequence headers so that the names would match, but the data still didn't pass the check_input filter. Could you please take a look at the files so I know how to edit them so that they match? I'm bewildered, since both files were produced by the same application (Prokka 1.14.6), so the naming should be consistent across its output files. Could you please help me out and check the files? I'm really eager to use your tool. Thanks in advance.
Hi.
Please, do not ask for help in someone else's closed issue. It's part of the etiquette of asking for help online.
If you have a problem, you have 2 options:
One cannot simply post links to hundreds of MB of data and ask others to download the files and track down whatever error there is.
Thanks for understanding, Fabricio
Hello,
Thanks for developing such useful tools. I could not find anywhere the proper function to import the data (proteomes and annotations) into syntenet, or the correct format for them.
I have modified the data as described here https://github.com/zhaotao1987/SynNet-Pipeline/wiki/Genome-Preparation. Is this the correct format?
Which function should I use? The documentation describes an example using already existing data, so no parsing is shown.
Many thanks, Alex