Closed mbthornton-lbl closed 3 months ago
linkml-validate: No issues found
linkml-convert: successful
Imported to GraphDB
SPARQL query:
PREFIX nmdc: <https://w3id.org/nmdc/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# Orphan DataObjects - not object of has_input or has_output
select * where {
?do a nmdc:DataObject .
minus {
?o nmdc:has_input ?do .
}
minus {
?o nmdc:has_output ?do .
}
} limit 100
Returns to results
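For anyone who wants to sanity-check the same condition without a GraphDB instance, the orphan check can be sketched in Python as a set difference. This is only an illustration on in-memory records: it assumes each process record carries `has_input` and `has_output` lists of DataObject ids, and the function name `find_orphan_data_objects` and the sample ids are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def find_orphan_data_objects(data_object_ids: Iterable[str],
                             processes: Iterable[dict]) -> list[str]:
    """Return ids that no process references via has_input or has_output."""
    referenced: set[str] = set()
    for proc in processes:
        referenced.update(proc.get("has_input", []))
        referenced.update(proc.get("has_output", []))
    # Mirrors the two MINUS clauses of the SPARQL query.
    return sorted(set(data_object_ids) - referenced)


# Hypothetical sample data, not taken from the NMDC database.
sample_data_objects = ["nmdc:dobj-1", "nmdc:dobj-2", "nmdc:dobj-3"]
sample_processes = [
    {"id": "nmdc:proc-1",
     "has_input": ["nmdc:dobj-1"],
     "has_output": ["nmdc:dobj-2"]},
]
orphans = find_orphan_data_objects(sample_data_objects, sample_processes)
print(orphans)
```

An empty list corresponds to the SPARQL query returning no rows.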
Schema v10 compatibility issues will be addressed by: https://github.com/microbiomedata/nmdc_automation/issues/66
@mbthornton-lbl should this re-opened issue get moved to the next sprint?
Re-ID logic is not deleting multiple versions; please rerun. For example, there are still legacy data objects for gold:Gp0321263 even though there is a properly re-IDed record: {'description':{$regex:/Gp0321263/}} on data_object_set returns 14 records.
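One way to spot leftover versions before rerunning is to count `data_object_set` records per GOLD project id mentioned in their description. The sketch below works on in-memory dicts only; the helper name `count_by_gold_project`, the `Gp\d+` pattern, and the sample records are assumptions for illustration and are not part of the re-ID code.

```python
import re
from collections import Counter

# Assumed pattern for GOLD project ids such as Gp0321263.
GOLD_PROJECT_RE = re.compile(r"Gp\d+")


def count_by_gold_project(records):
    """Count data object records per GOLD project id found in description."""
    counts = Counter()
    for rec in records:
        # A record is counted once per distinct project id it mentions.
        for gp in set(GOLD_PROJECT_RE.findall(rec.get("description", ""))):
            counts[gp] += 1
    return counts


# Hypothetical records: two legacy entries and one re-IDed entry
# for the same project, plus one unrelated record.
sample = [
    {"id": "nmdc:legacy-a", "description": "Filtered reads for gold:Gp0321263"},
    {"id": "nmdc:legacy-b", "description": "QC stats for gold:Gp0321263"},
    {"id": "nmdc:dobj-11-x", "description": "Filtered reads for gold:Gp0321263"},
    {"id": "nmdc:dobj-11-y", "description": "Assembly for gold:Gp0000001"},
]
counts = count_by_gold_project(sample)
print(counts["Gp0321263"])
```

A count higher than expected for a project suggests that legacy and re-IDed records coexist and the deletion step should be rerun.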
This is the study where the NOM omics records still need to be deleted by Yuri.
'omics_processing_set' ).aggregate( [
  { $match: { 'omics_type.has_raw_value': 'Organic Matter Characterization' } },
  { $lookup: { from: 'data_object_set', localField: 'has_output', foreignField: 'id', as: 'output_do' } },
  { $match: { output_do: { $exists: true, $size: 0 } } },
  { $group: { _id: '$part_of', count: { $sum: 1 } } }
], { maxTimeMS: 60000, allowDiskUse: true } );
Re-ran omics_processing_has_output_data_objects: 0 results
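To make the logic of the aggregation above easier to follow, here is a rough Python equivalent that runs on in-memory records instead of MongoDB. It is a sketch under assumptions: the function name `count_missing_outputs` and the sample records are hypothetical, and `part_of` is treated as a list of study ids. The pipeline itself is also written out as Python dicts, which is the form one would pass to pymongo's `Collection.aggregate`, although no database call is made here.

```python
from collections import Counter

OMICS_TYPE = "Organic Matter Characterization"

# The same stages as the shell pipeline, as Python dicts. With pymongo this
# could be run as (assumed usage, not executed here):
#   db["omics_processing_set"].aggregate(PIPELINE, maxTimeMS=60000,
#                                        allowDiskUse=True)
PIPELINE = [
    {"$match": {"omics_type.has_raw_value": OMICS_TYPE}},
    {"$lookup": {"from": "data_object_set", "localField": "has_output",
                 "foreignField": "id", "as": "output_do"}},
    {"$match": {"output_do": {"$exists": True, "$size": 0}}},
    {"$group": {"_id": "$part_of", "count": {"$sum": 1}}},
]


def count_missing_outputs(omics_records, data_object_ids):
    """Count NOM omics records whose has_output ids match no data object,
    grouped by their part_of value."""
    known = set(data_object_ids)
    counts = Counter()
    for rec in omics_records:
        if rec.get("omics_type", {}).get("has_raw_value") != OMICS_TYPE:
            continue  # first $match stage
        if any(out in known for out in rec.get("has_output", [])):
            continue  # $lookup found at least one data object
        # part_of is a list, so use a tuple as the hashable group key.
        counts[tuple(rec.get("part_of", []))] += 1
    return dict(counts)


# Hypothetical sample data.
sample_omics = [
    {"id": "nmdc:omprc-1", "omics_type": {"has_raw_value": OMICS_TYPE},
     "has_output": ["nmdc:dobj-missing"], "part_of": ["nmdc:sty-11-aaaa"]},
    {"id": "nmdc:omprc-2", "omics_type": {"has_raw_value": OMICS_TYPE},
     "has_output": ["nmdc:dobj-1"], "part_of": ["nmdc:sty-11-aaaa"]},
    {"id": "nmdc:omprc-3", "omics_type": {"has_raw_value": "Metagenome"},
     "has_output": [], "part_of": ["nmdc:sty-11-bbbb"]},
]
result = count_missing_outputs(sample_omics, ["nmdc:dobj-1"])
print(result)
```

An empty result corresponds to the "0 results" outcome reported above: every NOM omics record has at least one output that resolves to a data object.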
Resolved by #1894
Note: Scope of this work is the Napa Database Instance. The same steps will need to be repeated in a prod-ready environment.
For the "CrestedButte" Study - id: nmdc:sty-11-dcqce727, legacy id: gold:Gs0135149, jgi proposal id: 503568
linkml-validate