refgenie / plantref

Refgenieserver content repository for plant genomes server
http://plantref.databio.org

update all PLAZA entries #2

Closed ieguinoa closed 4 years ago

ieguinoa commented 4 years ago

Hi,

Sorry, I missed the notification for the issue when the repo was created.

Here is a quick extract I made with some more metadata, although I'm not familiar with the PEP format. I quickly checked the pipeline_interfaces/ dir and couldn't easily see how the link is made between the entries in the assets and recipe_inputs tables. The way I'm listing them now is what I assumed you were following: samplename = genome + '-fasta'. Is that correct? I've edited the way the genome ids are printed so it won't create ids with weird things like '-'. Also, for the description, I'm not sure whether some characters are forbidden or cause crashes when used in refgenie; I can change that if needed. And of course I'm still missing the checksums; I need to get those from the FTP server dirs.
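For reference, the naming convention described above can be sketched roughly as follows. This is only an illustration of the assumed rule (samplename = genome + '-fasta'); the helper names `sanitize_genome_id` and `sample_name`, the exact set of characters replaced, and the raw input name are all hypothetical and are not taken from the actual extraction script or from refgenie.

```python
import re


def sanitize_genome_id(raw_name):
    """Replace every character other than letters, digits and '_' with '_'.

    Hypothetical rule: the real extraction script may treat characters differently.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", raw_name)


def sample_name(genome_id, asset="fasta"):
    """Build a sample name using the assumed genome + '-fasta' convention."""
    return f"{genome_id}-{asset}"


# Made-up raw name, used only to show the two steps.
genome = sanitize_genome_id("Arabidopsis thaliana TAIR10")
print(genome)               # Arabidopsis_thaliana_TAIR10
print(sample_name(genome))  # Arabidopsis_thaliana_TAIR10-fasta
```

Keeping '-' out of the genome id means the only '-' in the sample name is the one separating the genome from the asset name.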

Ignacio

nsheff commented 4 years ago

Wow, I'm sorry, I didn't get a notification for this issue! I've been wondering what's going on. I somehow wasn't listed on 'watchers' -- now I've fixed that. I will take a look at this soon!

nsheff commented 4 years ago

Yeah, this looks correct on the surface. I will give it a try.

nsheff commented 4 years ago

Hi @ieguinoa, I'm processing these and just noticed you have one genome with a slash in the name:

Asterochloris_sp__Cgr/DA1pho_v2_0_JGI_7_45_13

This isn't going to work: a slash can't be an allowable character in a refgenie genome name, because we use slashes to indicate genome/asset registry paths (e.g. hg38/bowtie2_index). Can that name be changed?
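To illustrate why the slash is a problem, here is a minimal sketch of a naive genome/asset split plus a pre-check for offending names. Both `split_registry_path` and `genomes_with_slash` are hypothetical helpers written for this example; they are not refgenie's actual parser or API, and the `/fasta` suffix is just an assumed asset name.

```python
def split_registry_path(path):
    """Naively split 'genome/asset' on the first slash (not refgenie's real parser)."""
    genome, asset = path.split("/", 1)
    return genome, asset


def genomes_with_slash(names):
    """Return the genome names that contain a slash and would need renaming."""
    return [name for name in names if "/" in name]


print(split_registry_path("hg38/bowtie2_index"))
# ('hg38', 'bowtie2_index')

# With a slash inside the genome name, the split lands in the wrong place.
print(split_registry_path("Asterochloris_sp__Cgr/DA1pho_v2_0_JGI_7_45_13/fasta"))
# ('Asterochloris_sp__Cgr', 'DA1pho_v2_0_JGI_7_45_13/fasta')

print(genomes_with_slash(["hg38", "Asterochloris_sp__Cgr/DA1pho_v2_0_JGI_7_45_13"]))
# ['Asterochloris_sp__Cgr/DA1pho_v2_0_JGI_7_45_13']
```

Running a check like this over the genome column before processing would flag such names early.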