Closed bradfordcondon closed 5 years ago
Stats are added as properties.
I'd like to get the "date performed" value to put in the chado base table. However, there are so many date tags. How would we know which is the right one?
<AsmReleaseDate_GenBank>2016/06/01 00:00</AsmReleaseDate_GenBank>
<AsmReleaseDate_RefSeq>2017/07/14 00:00</AsmReleaseDate_RefSeq>
<SeqReleaseDate>2016/06/01 00:00</SeqReleaseDate>
<AsmUpdateDate>2017/07/19 00:00</AsmUpdateDate>
<SubmissionDate>2016/06/01 00:00</SubmissionDate>
<LastUpdateDate>2017/07/19 00:00</LastUpdateDate>
SubmissionDate seems like a good choice. I don't know what happens if we have 2 assemblies, though.
'Date performed' is one of those pieces of metadata that we have never aspired to collect from the user - often because we're importing assemblies, and you don't perform an arthropod genome assembly on a single day. I'm honestly not sure what the 'date performed' is meant for. But, it's required. I think you are correct that SubmissionDate is probably the best alternative.
When would you have 2 assemblies for a given analysis? I guess everything's possible...
<WGS>LVXX01</WGS> gets added as a dbxref (see #58).
When would you have 2 assemblies for a given analysis?
I think I was thinking of GenBank vs. RefSeq, but you're right, those have their own unique release date tags. So now I'm not sure what my concern was :)
We'll go with SubmissionDate.
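For reference, here is a minimal sketch of pulling SubmissionDate out of the summary XML and turning it into a date value. It is written in Python purely for illustration and is not the module's actual code. The `date_performed` function name and the `<DocumentSummary>` wrapper element are assumptions added so the fragment parses on its own; the date tags and the `YYYY/MM/DD HH:MM` format come from the example above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Date tags copied from the example in this thread. The <DocumentSummary>
# wrapper is an assumption, added only so the fragment has a single root.
SAMPLE = """<DocumentSummary>
  <AsmReleaseDate_GenBank>2016/06/01 00:00</AsmReleaseDate_GenBank>
  <AsmReleaseDate_RefSeq>2017/07/14 00:00</AsmReleaseDate_RefSeq>
  <SeqReleaseDate>2016/06/01 00:00</SeqReleaseDate>
  <AsmUpdateDate>2017/07/19 00:00</AsmUpdateDate>
  <SubmissionDate>2016/06/01 00:00</SubmissionDate>
  <LastUpdateDate>2017/07/19 00:00</LastUpdateDate>
</DocumentSummary>"""


def date_performed(xml_text, tag="SubmissionDate"):
    """Return the chosen date tag as a datetime, or None if it is absent.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = ET.fromstring(xml_text)
    node = root.find(f".//{tag}")
    if node is None or not (node.text or "").strip():
        return None
    # Format observed in the example tags: "2016/06/01 00:00"
    return datetime.strptime(node.text.strip(), "%Y/%m/%d %H:%M")


if __name__ == "__main__":
    print(date_performed(SAMPLE))
```

Keeping the tag name as a parameter means a different tag (for example AsmReleaseDate_GenBank) could be swapped in later without touching the parsing logic.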
Looking at https://github.com/NAL-i5K/tripal_eutils/tree/master/examples/assembly for examples.
Below are example tags from 751381 that aren't dealt with via dbxrefs/linked records.
Additionally, we have all of the STATS tags.
Right now I collect each one by combining the category and tag, so we get, for example, scaffold_count_all, scaffold_count_placed, etc. Would we want ALL of these as properties?
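To make the naming scheme concrete, here is a rough sketch of the category-plus-tag combination, again in Python for illustration only. The XML layout (a `<Stat>` element with `category` and `sequence_tag` attributes), the `stats_to_properties` name, and the numeric values are all assumptions and placeholders, not data from record 751381.

```python
import xml.etree.ElementTree as ET

# Assumed layout of the stats block. The attribute names and the values
# 10 and 8 are placeholders for illustration, not real record data.
SAMPLE_STATS = """<Stats>
  <Stat category="scaffold_count" sequence_tag="all">10</Stat>
  <Stat category="scaffold_count" sequence_tag="placed">8</Stat>
</Stats>"""


def stats_to_properties(xml_text):
    """Map each stat to a '<category>_<tag>' property name.

    Hypothetical helper: returns a dict of property name -> value string.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for stat in root.iter("Stat"):
        category = stat.get("category")
        tag = stat.get("sequence_tag")
        if not category or not tag:
            # Skip entries that cannot produce a combined name.
            continue
        props[f"{category}_{tag}"] = (stat.text or "").strip()
    return props


if __name__ == "__main__":
    for name, value in stats_to_properties(SAMPLE_STATS).items():
        print(name, value)
```

If we decide not to store every stat, a whitelist of property names could be applied to the returned dict before inserting anything.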