Open gtsueng opened 2 months ago
As seen in https://github.com/NIAID-Data-Ecosystem/niaid-feedback/issues/127, this issue has been approved for work start
@jal347 I've assigned this issue to you. Please let me know if you have any questions, after you've had a look at it.
If you need an example, consider AmoebaDB https://amoebadb.org/amoeba/app. Each record in AmoebaDB should already be in the NIAID Data Ecosystem because each record is also in VEuPathDB (which is already included). What we want to do is identify all the records in AmoebaDB and add to each record:
"includedInDataCatalog": {
  "@type": "DataCatalog",
  "name": "AmoebaDB",
  "url": "https://amoebadb.org/amoeba/app/record/dataset/identifier",
  "versionDate": "YYYY-MM-DD"
}
@hartwickma, @lisa-mml, @rshabman, @sudvenk
As mentioned at the bi-weekly meeting dated 2024.06.11, the VEuPathDB collections are now available on staging and can be found on the staging site under the 'source' filter.
If you have any feedback or suggestions on how collections like the ones coming from VEuPathDB should be displayed, please provide them to the collections issue here: https://github.com/NIAID-Data-Ecosystem/niaid-feedback/issues/136
Per the discussion at the bi-weekly meeting dated 2024.06.25, VEuPathDB on Staging has not yet been moved to Production because of the metadata changes caused by merging in the VEuPathDB collections data.
@DylanWelzel @jal347 is it possible to update VEuPathDB on Production without the data merged in from VEuPathDB collections? If so, please proceed to do so. If not, then we will wait until VEuPathDB collections are approved before proceeding.
Hi @gtsueng, thanks for following up on this item to update VEuPathDB to the most recent release. NIAID approves the VEuPathDB collections for production, and this will hopefully remove the metadata merge complication.
There is also an issue with the current display of VEuPathDB in the filter function that needs to be fixed. The filter options under 'Resources' need to reflect that these VEuPathDB collections are part of VEuPathDB. Please:
We also discussed a related item in the meeting yesterday about the new sub-headers in the filter section, where new terms (Other Resources, Basic Science Repositories) have appeared. Please remove these terms and return to the previously agreed-upon Domains of 'IID' and 'Generalist Repositories' while discussions continue about how best to make domain assignments and assign terms.
Hi @hartwickma,
@DylanWelzel and I discussed this earlier, and he confirmed that it is possible to update VEuPathDB on Production without pushing the VEuPathDB collections. We will proceed to update VEuPathDB on Production without the VEuPathDB collections, as the following requests may require additional changes on the back and front ends:
- Move the VEuPathDB collections from where they are listed under 'Other Resources' so that they appear with VEuPathDB under 'IID Resources'
- We would like the appearance of these VEuPathDB collections to indicate that they are part of the larger VEuPathDB repository, so please consider how best to show this in the filter list (e.g., indentation, bullets, etc.).
We are marking this issue as 'in progress -- refinement' to reflect the changes needed to
Source Name
VEuPathDB collections
Source URL
see description
Source Description
VEuPathDB hosts numerous collections which are treated as sources in their own right. This includes GiardiaDB, CryptoDB, etc.
These collections are as follows:
AmoebaDB | https://amoebadb.org/amoeba/app
CryptoDB | https://cryptodb.org/cryptodb/app
GiardiaDB | https://giardiadb.org/giardiadb/app
HostDB | https://hostdb.org/hostdb/app
PlasmoDB | https://plasmodb.org/plasmo/app
VectorBase | https://vectorbase.org/vectorbase/app
FungiDB | https://fungidb.org/fungidb/app
MicrosporidiaDB | https://microsporidiadb.org/micro/app
ToxoDB | https://toxodb.org/toxo/app
TrichDB | https://trichdb.org/trichdb/app
TriTrypDB | https://tritrypdb.org/tritrypdb/app
PiroplasmaDB | https://piroplasmadb.org/piro/app
The URL structure for a record in these databases is similar to that of VEuPathDB:
Structure: https://{base_url}/record/dataset/{identifier}
Example: https://amoebadb.org/amoeba/app/record/dataset/DS_63733e001b
Identical record in VEuPathDB: https://veupathdb.org/veupathdb/app/record/dataset/DS_63733e001b
As seen above, the record ID is identical between the resources.
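The URL pattern above can be sketched in Python as follows. This is only an illustration: the names `COLLECTION_BASE_URLS`, `record_url`, and `record_id_from_url` are hypothetical, and the sketch assumes the "base URL" is the full app URL listed for each collection (scheme included).

```python
# Hypothetical lookup table; only two of the listed resources are shown.
COLLECTION_BASE_URLS = {
    "AmoebaDB": "https://amoebadb.org/amoeba/app",
    "VEuPathDB": "https://veupathdb.org/veupathdb/app",
}


def record_url(base_url: str, identifier: str) -> str:
    """Build a dataset record URL following {base_url}/record/dataset/{identifier}."""
    return f"{base_url.rstrip('/')}/record/dataset/{identifier}"


def record_id_from_url(url: str) -> str:
    """Return the trailing path segment, assumed here to be the record ID."""
    return url.rstrip("/").rsplit("/", 1)[-1]


amoeba_url = record_url(COLLECTION_BASE_URLS["AmoebaDB"], "DS_63733e001b")
veupath_url = record_url(COLLECTION_BASE_URLS["VEuPathDB"], "DS_63733e001b")

# The same record ID appears in both URLs, so it can serve as the join key.
same_id = record_id_from_url(amoeba_url) == record_id_from_url(veupath_url)
```

Because the ID is shared, a record found in a collection database can be matched to the record already ingested from VEuPathDB without any extra mapping.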
Desired outcome: We would like to add an includedInDataCatalog value for each of the records hosted in these "databases". Currently, they are likely to be ingested via VEuPathDB and have an includedInDataCatalog.name value of veupathdb. We would like each record to have includedInDataCatalog values of [{name: veupathdb, etc.}, {name: otherdb, etc.}] for whichever DB the record is also a part of.
To do:
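As a rough illustration of the desired outcome, the sketch below appends a collection entry to a record's existing includedInDataCatalog value. It is not the actual ingestion code: the function name `add_catalog`, the shape of the sample record, and the choice to normalise a single object into a list are all assumptions, and the versionDate is left as the "YYYY-MM-DD" placeholder.

```python
from copy import deepcopy


def add_catalog(record: dict, name: str, url: str, version_date: str) -> dict:
    """Return a copy of `record` with one more includedInDataCatalog entry."""
    updated = deepcopy(record)
    existing = updated.get("includedInDataCatalog", [])
    if isinstance(existing, dict):
        # Assumption: a single catalog may be stored as a bare object.
        existing = [existing]
    entry = {
        "@type": "DataCatalog",
        "name": name,
        "url": url,
        "versionDate": version_date,
    }
    # Skip the append if this catalog is already listed (keeps reruns idempotent).
    if not any(catalog.get("name") == name for catalog in existing):
        existing.append(entry)
    updated["includedInDataCatalog"] = existing
    return updated


# Hypothetical record as it might look after ingestion via VEuPathDB.
record = {
    "identifier": "DS_63733e001b",
    "includedInDataCatalog": {"@type": "DataCatalog", "name": "veupathdb"},
}

updated = add_catalog(
    record,
    name="AmoebaDB",
    url="https://amoebadb.org/amoeba/app/record/dataset/DS_63733e001b",
    version_date="YYYY-MM-DD",  # placeholder, as in the issue description
)
catalog_names = [catalog["name"] for catalog in updated["includedInDataCatalog"]]
```

Matching on the catalog name means running the update twice does not duplicate the entry, which may matter if the collections are re-crawled on each release.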
Caveats:
name AND NO description. If this increases significantly in the process, this suggests that either VEuPathDB is out-of-date relative to the collection database, OR that the collection database contains records NOT available in VEuPathDB. Measures to address these records should be taken.
Source Access
No access issue, account not needed
Source Funding
NIAID
Source Relevance
NIAID-funded
Related WBS task
For internal use only. Assignee, please select the status of this issue
Status Description
Please hold on starting this issue until the NIAID team confirms the desired outcome. See https://github.com/NIAID-Data-Ecosystem/niaid-feedback/issues/127
Source to-do list