idanefroni / Conservatory

Identification of conserved non-coding sequences in plants
GNU General Public License v3.0

Example genome_database.csv #2

Closed: eggrandio closed this issue 1 year ago

eggrandio commented 1 year ago

Hello,

I am trying to use Conservatory with the genomes used in Hendelman et al., but the example genome_database.csv file is missing from the git repo (in the instructions, it is stated that: "The genome specification for all genomes used in Hendelman et al. are already in the file.").

I was also wondering whether the genomes and GFF files used in the paper are available from a repository (Zenodo?), or whether I should download them all manually.

Thanks!

idanefroni commented 1 year ago

Dear Eduardo,

The version of Conservatory used in Hendelman et al. is in the Conservatory 1.5 branch, which contains the genome_database.csv file. We are currently upgrading to a new version on the master branch, but it is still in development and unstable. I see now that the default branch was set incorrectly, and I have fixed that.

Regarding the genomes, you can download them yourself, but I can definitely save you the time and share the directory. The only issue is that it is quite large (~100 GB) and exceeds the Zenodo size limit. I will try to upload it to Dryad and send you a link.

Idan

eggrandio commented 1 year ago

Dear Idan,

Thanks for your quick reply. I have started downloading the genomes, but a Dryad link would definitely save time, since the formatting of the annotation and protein FASTA files seems to be genome-specific and reformatting them can take a while.

We are mainly interested in Solanaceae genomes (and maybe Brassicaceae), though.

Best,