legumeinfo / datastore-specifications

Specifications for directory naming, file naming, file contents in the LIS datastore

RFC: split /genetic into /gwas and /qtl collections #29

Closed sammyjava closed 2 years ago

sammyjava commented 2 years ago

Currently /genetic contains both GWAS and QTL studies. This was originally motivated by wanting to accommodate publications that contained both GWAS and QTL experiments. But we've moved away from a one-to-one correspondence between publications and collections.

Later, we decided to demarcate GWAS collections with .gwas. and QTL study collections with .gen., effectively splitting them apart within /genetic/. That is the current state of the DS: all GWAS are under /genetic/x.gwas.x and all QTL are under /genetic/x.gen.x

Note: This makes our /genetic/ collections inconsistent with the rest of the DS, in that we have both .gwas. and .gen. collections under /genetic/ while all other collection directories have a single .abc. on the collections within them.

Proposal: let's make this clear by using separate collection directories and dropping /genetic/:

The GWAS update is trivial, just moving those directories to /gwas.

The QTL update is slightly less trivial, renaming the top directory to /qtl, renaming the collections to .qtl. and renaming all the files to .qtl. But that's why we have scripts which I'll be happy to implement (using git mv so we deal with that aspect of things).
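To make the two renames concrete, here is a minimal sketch of the path mapping such a script might compute. It is an illustration under stated assumptions, not the script that was actually used: the function names `plan_moves` and `as_git_commands` are hypothetical, the example paths are placeholders in the x.gwas.y / x.gen.y pattern, and it assumes paths are given relative to a species directory. It only prints `git mv` lines; a real script would also need to create the destination directories before running them.

```python
from pathlib import PurePosixPath


def plan_moves(paths):
    """Return (old, new) pairs for the proposed /genetic split.

    Each input is a path relative to a species directory, of the form
    genetic/<collection>/<file>. Paths outside genetic/ are skipped.
    (Hypothetical helper, for illustration only.)
    """
    moves = []
    for old in paths:
        parts = PurePosixPath(old).parts
        if len(parts) < 2 or parts[0] != "genetic":
            continue
        collection = parts[1]
        if ".gwas." in collection:
            # GWAS: only the top-level directory changes.
            new_parts = ("gwas",) + parts[1:]
        elif ".gen." in collection:
            # QTL: new top directory, and .gen. becomes .qtl. in the
            # collection name and in every file name beneath it.
            renamed = tuple(p.replace(".gen.", ".qtl.") for p in parts[1:])
            new_parts = ("qtl",) + renamed
        else:
            # Unknown collection type: leave it alone.
            continue
        moves.append((old, str(PurePosixPath(*new_parts))))
    return moves


def as_git_commands(moves):
    """Render each planned move as a `git mv` command line (not executed)."""
    return [f"git mv {old} {new}" for old, new in moves]


if __name__ == "__main__":
    # Placeholder paths, not real datastore collections.
    example = [
        "genetic/x.gwas.y/x.gwas.y.result.tsv",
        "genetic/x.gen.y/x.gen.y.qtl.tsv",
    ]
    for command in as_git_commands(plan_moves(example)):
        print(command)
```

Keeping the planning step separate from execution means the full list of renames can be reviewed before anything is moved.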

The specified DS file structure under species will then look like:

(ignoring as-yet non-specified directories which appear in various places).

This may make @svengato 's life slightly easier, but perhaps not even noticeable. My QTL and GWAS loaders are easy to separate out of my lis-genetic loader, as they're really independent loads, anyway.

svengato commented 2 years ago

This may make @svengato 's life slightly easier, but perhaps not even noticeable.

Right - the changes will be trivial.


StevenCannon-USDA commented 2 years ago

I'm on-board with the splitting. I'll note that I'll be mostly unavailable for the next two weeks (Sept 8-22) - again dealing with family stuff in Utah. So I won't be able to help with the rearrangement in that timeframe.

sammyjava commented 2 years ago

I'll implement the rearrangement if we decide to do it. Sounds like that's a GO unless @adf-ncgr or @sdash-github or @ctcncgr have major objections. I said "RFC" not "RFO" expecting none. :)

adf-ncgr commented 2 years ago

no objections here

sdash-github commented 2 years ago

I am with the split into two for convenience.

sammyjava commented 2 years ago

Sounds like a quorum, I'll remove y'all from the assignees and add it to my tasks. Let me know if there's a desired delay for implementation, I'm in no great hurry.

sammyjava commented 2 years ago

This is done.

sdash-github commented 2 years ago

I am not sure why there should be a delay unless one of us is doing some update or something similar. But that is up to the developer.

From the PeanutBase point of view, let us do it soon, as 1) an annual report is due and 2) it will help in writing, for the new PB-Jekyll site, page content that directs to PeanutMine and explains what to expect there. We are trying to get the new site minimally presentable by Sep-30.

On 2022/9/8 10:59 AM, Sam Hokin wrote:

Sounds like a quorum, I'll remove y'all from the assignees and add it to my tasks. Let me know if there's a desired delay for implementation, I'm in no great hurry.
