VirtualFlyBrain / vfb-pipeline-config

Pipeline 2.0 Configuration
Apache License 2.0

Analyse count discrepancies across KB, PDB and PDB2 #21

Closed matentzn closed 4 years ago

matentzn commented 4 years ago

Ok @dosumis @Robbie1977 I need some help with this:

Class counts:

Any ideas where I should be looking? Is it possible that the 50K classes in the triple store are correct, given that the slicing in the collect data pipeline may be more rigorous now than it was before? The same question applies to the individual counts.

dosumis commented 4 years ago

I wouldn't worry about comparing class numbers

matentzn commented 4 years ago

I do at least worry about TS->PDB2.

matentzn commented 4 years ago

The TS discrepancy was mostly because the counting query took anonymous expressions (blank nodes) into account.
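
To illustrate how blank nodes can inflate a class count, here is a minimal sketch in plain Python. It is not the query used in the pipeline (that query is not shown in this thread). The toy triples, the `example.org` IRIs, and the helper names `is_blank` and `count_classes` are all hypothetical, and blank nodes are assumed to be written with the usual `_:` prefix.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical toy data: two named classes and two anonymous class
# expressions (blank nodes), all typed as owl:Class.
TRIPLES = [
    ("http://example.org/A", RDF_TYPE, OWL_CLASS),
    ("http://example.org/B", RDF_TYPE, OWL_CLASS),
    ("_:b0", RDF_TYPE, OWL_CLASS),
    ("_:b1", RDF_TYPE, OWL_CLASS),
    ("http://example.org/A", RDFS_SUBCLASS_OF, "_:b0"),
]


def is_blank(term):
    """Return True if the term is a blank node label (assumed '_:' prefix)."""
    return term.startswith("_:")


def count_classes(triples, include_blank=True):
    """Count distinct subjects typed as owl:Class, optionally skipping blank nodes."""
    subjects = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_CLASS}
    if not include_blank:
        subjects = {s for s in subjects if not is_blank(s)}
    return len(subjects)


if __name__ == "__main__":
    print("all classes:", count_classes(TRIPLES))
    print("named classes only:", count_classes(TRIPLES, include_blank=False))
```

In a SPARQL counting query, the equivalent restriction would typically be a `FILTER(!isBlank(?c))` on the counted variable.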

matentzn commented 4 years ago

OK, we will close this, as most of the discrepancies have been accounted for. Some of the larger ones are due to FlyBase side loading: expression patterns and the like.