ispyb / ispyb-database-modeling

4 stars 3 forks source link

Cleaning up Screening Tables #46

Open antolinos opened 4 years ago

antolinos commented 4 years ago

The goal of this issue is to check if some columns or tables could be simplified by, for instance, removing unused or deprecated columns.

This is the current schema:

Screenshot from 2019-09-11 15:02:11

antolinos commented 4 years ago

By running this query:

SELECT 
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where diffractionPlanId is not NULL) as "Screening.diffractionPlanId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where dataCollectionGroupId is not NULL) as "Screening.dataCollectionGroupId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where dataCollectionId is not NULL) as "Screening.dataCollectionId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where bltimeStamp is not NULL) as "Screening.bltimeStamp",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where programVersion is not NULL) as "Screening.programVersion",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where comments is not NULL) as "Screening.comments",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where shortComments is not NULL) as "Screening.shortComments",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from Screening where xmlSampleInformation is not NULL) as "Screening.xmlSampleInformation",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where screeningOutputId is not NULL) as "ScreeningOutput.screeningOutputId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where screeningId is not NULL) as "ScreeningOutput.screeningId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where statusDescription is not NULL) as "ScreeningOutput.statusDescription",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where rejectedReflections is not NULL) as "ScreeningOutput.rejectedReflections",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where resolutionObtained is not NULL) as "ScreeningOutput.resolutionObtained",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where spotDeviationR is not NULL) as "ScreeningOutput.spotDeviationR",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where spotDeviationTheta is not NULL) as "ScreeningOutput.spotDeviationTheta",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where beamShiftX is not NULL) as "ScreeningOutput.beamShiftX",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where beamShiftY is not NULL) as "ScreeningOutput.beamShiftY",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where numSpotsFound is not NULL) as "ScreeningOutput.numSpotsFound",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where numSpotsUsed is not NULL) as "ScreeningOutput.numSpotsUsed",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where numSpotsRejected is not NULL) as "ScreeningOutput.numSpotsRejected",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where mosaicity is not NULL) as "ScreeningOutput.mosaicity",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where iOverSigma is not NULL) as "ScreeningOutput.iOverSigma",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where diffractionRings is not NULL) as "ScreeningOutput.diffractionRings",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where strategySuccess is not NULL) as "ScreeningOutput.strategySuccess",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where mosaicityEstimated is not NULL) as "ScreeningOutput.mosaicityEstimated",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where rankingResolution is not NULL) as "ScreeningOutput.rankingResolution",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where program is not NULL) as "ScreeningOutput.program",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where doseTotal is not NULL) as "ScreeningOutput.doseTotal",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where totalExposureTime is not NULL) as "ScreeningOutput.totalExposureTime",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where totalRotationRange is not NULL) as "ScreeningOutput.totalRotationRange",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where totalNumberOfImages is not NULL) as "ScreeningOutput.totalNumberOfImages",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where rFriedel is not NULL) as "ScreeningOutput.rFriedel",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where indexingSuccess is not NULL) as "ScreeningOutput.indexingSuccess",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutput where screeningSuccess is not NULL) as "ScreeningOutput.screeningSuccess",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRank where screeningRankId is not NULL) as "ScreeningRank.screeningRankId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRank where screeningRankSetId is not NULL) as "ScreeningRank.screeningRankSetId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRank where screeningId is not NULL) as "ScreeningRank.screeningId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRank where rankValue is not NULL) as "ScreeningRank.rankValue",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRank where rankInformation is not NULL) as "ScreeningRank.rankInformation",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRankSet where screeningRankSetId is not NULL) as "ScreeningRankSet.screeningRankSetId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRankSet where rankEngine is not NULL) as "ScreeningRankSet.rankEngine",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRankSet where rankingProjectFileName is not NULL) as "ScreeningRankSet.rankingProjectFileName",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningRankSet where rankingSummaryFileName is not NULL) as "ScreeningRankSet.rankingSummaryFileName",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where screeningOutputLatticeId is not NULL) as "ScreeningOutputLattice.screeningOutputLatticeId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where screeningOutputId is not NULL) as "ScreeningOutputLattice.screeningOutputId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where spaceGroup is not NULL) as "ScreeningOutputLattice.spaceGroup",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where pointGroup is not NULL) as "ScreeningOutputLattice.pointGroup",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where bravaisLattice is not NULL) as "ScreeningOutputLattice.bravaisLattice",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_a_x is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_x",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_a_y is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_y",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_a_z is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_z",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_b_x is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_x",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_b_y is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_y",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_b_z is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_z",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_c_x is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_x",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_c_y is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_y",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where rawOrientationMatrix_c_z is not NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_z",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_a is not NULL) as "ScreeningOutputLattice.unitCell_a",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_b is not NULL) as "ScreeningOutputLattice.unitCell_b",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_c is not NULL) as "ScreeningOutputLattice.unitCell_c",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_alpha is not NULL) as "ScreeningOutputLattice.unitCell_alpha",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_beta is not NULL) as "ScreeningOutputLattice.unitCell_beta",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where unitCell_gamma is not NULL) as "ScreeningOutputLattice.unitCell_gamma",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where bltimeStamp is not NULL) as "ScreeningOutputLattice.bltimeStamp",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningOutputLattice where labelitIndexing is not NULL) as "ScreeningOutputLattice.labelitIndexing",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where screeningStrategyId is not NULL) as "ScreeningStrategy.screeningStrategyId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where screeningOutputId is not NULL) as "ScreeningStrategy.screeningOutputId",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where phiStart is not NULL) as "ScreeningStrategy.phiStart",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where phiEnd is not NULL) as "ScreeningStrategy.phiEnd",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where rotation is not NULL) as "ScreeningStrategy.rotation",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where exposureTime is not NULL) as "ScreeningStrategy.exposureTime",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where resolution is not NULL) as "ScreeningStrategy.resolution",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where completeness is not NULL) as "ScreeningStrategy.completeness",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where multiplicity is not NULL) as "ScreeningStrategy.multiplicity",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where anomalous is not NULL) as "ScreeningStrategy.anomalous",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where program is not NULL) as "ScreeningStrategy.program",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where rankingResolution is not NULL) as "ScreeningStrategy.rankingResolution",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategy where transmission is not NULL) as "ScreeningStrategy.transmission",
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where screeningStrategySubWedgeId is not NULL) as "ScreeningStrategySubWedge.screeningStrategySubWedgeId" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where screeningStrategyWedgeId is not NULL) as "ScreeningStrategySubWedge.screeningStrategyWedgeId" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where subWedgeNumber is not NULL) as "ScreeningStrategySubWedge.subWedgeNumber" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where rotationAxis is not NULL) as "ScreeningStrategySubWedge.rotationAxis" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where axisStart is not NULL) as "ScreeningStrategySubWedge.axisStart" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where axisEnd is not NULL) as "ScreeningStrategySubWedge.axisEnd" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where exposureTime is not NULL) as "ScreeningStrategySubWedge.exposureTime" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where transmission is not NULL) as "ScreeningStrategySubWedge.transmission" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where oscillationRange is not NULL) as "ScreeningStrategySubWedge.oscillationRange" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where completeness is not NULL) as "ScreeningStrategySubWedge.completeness" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where multiplicity is not NULL) as "ScreeningStrategySubWedge.multiplicity" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where doseTotal is not NULL) as "ScreeningStrategySubWedge.doseTotal" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where numberOfImages is not NULL) as "ScreeningStrategySubWedge.numberOfImages" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where comments is not NULL) as "ScreeningStrategySubWedge.comments" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategySubWedge where resolution is not NULL) as "ScreeningStrategySubWedge.resolution" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where screeningStrategyWedgeId is not NULL) as "ScreeningStrategyWedge.screeningStrategyWedgeId" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where screeningStrategyId is not NULL) as "ScreeningStrategyWedge.screeningStrategyId" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where wedgeNumber is not NULL) as "ScreeningStrategyWedge.wedgeNumber" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where resolution is not NULL) as "ScreeningStrategyWedge.resolution" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where completeness is not NULL) as "ScreeningStrategyWedge.completeness" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where multiplicity is not NULL) as "ScreeningStrategyWedge.multiplicity" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where doseTotal is not NULL) as "ScreeningStrategyWedge.doseTotal" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where numberOfImages is not NULL) as "ScreeningStrategyWedge.numberOfImages" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where phi is not NULL) as "ScreeningStrategyWedge.phi" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where kappa is not NULL) as "ScreeningStrategyWedge.kappa" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where chi is not NULL) as "ScreeningStrategyWedge.chi" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where comments is not NULL) as "ScreeningStrategyWedge.comments" ,
(select case(count(*)) when 0 then "NULL" else "NOT NULL" end from ScreeningStrategyWedge where wavelength is not NULL) as "ScreeningStrategyWedge.wavelength" 
from Screening limit 1;

We can find the columns that are null

antolinos commented 4 years ago

Sometimes for testing, colums are populated however they are not when software is deployed in production. By having the count of null values and total count we can find out which columns are populated only for testing purposes

select 
(select count(*) from Screening where screeningId is NULL) as "Screening.screeningId" ,
(select count(*) from Screening where diffractionPlanId is NULL) as "Screening.diffractionPlanId" ,
(select count(*) from Screening where dataCollectionGroupId is NULL) as "Screening.dataCollectionGroupId" ,
(select count(*) from Screening where dataCollectionId is NULL) as "Screening.dataCollectionId" ,
(select count(*) from Screening where bltimeStamp is NULL) as "Screening.bltimeStamp" ,
(select count(*) from Screening where programVersion is NULL) as "Screening.programVersion" ,
(select count(*) from Screening where comments is NULL) as "Screening.comments" ,
(select count(*) from Screening where shortComments is NULL) as "Screening.shortComments" ,
(select count(*) from Screening where xmlSampleInformation is NULL) as "Screening.xmlSampleInformation" ,
(select count(*) from ScreeningOutput where screeningOutputId is NULL) as "ScreeningOutput.screeningOutputId" ,
(select count(*) from ScreeningOutput where screeningId is NULL) as "ScreeningOutput.screeningId" ,
(select count(*) from ScreeningOutput where statusDescription is NULL) as "ScreeningOutput.statusDescription" ,
(select count(*) from ScreeningOutput where rejectedReflections is NULL) as "ScreeningOutput.rejectedReflections" ,
(select count(*) from ScreeningOutput where resolutionObtained is NULL) as "ScreeningOutput.resolutionObtained" ,
(select count(*) from ScreeningOutput where spotDeviationR is NULL) as "ScreeningOutput.spotDeviationR" ,
(select count(*) from ScreeningOutput where spotDeviationTheta is NULL) as "ScreeningOutput.spotDeviationTheta" ,
(select count(*) from ScreeningOutput where beamShiftX is NULL) as "ScreeningOutput.beamShiftX" ,
(select count(*) from ScreeningOutput where beamShiftY is NULL) as "ScreeningOutput.beamShiftY" ,
(select count(*) from ScreeningOutput where numSpotsFound is NULL) as "ScreeningOutput.numSpotsFound" ,
(select count(*) from ScreeningOutput where numSpotsUsed is NULL) as "ScreeningOutput.numSpotsUsed" ,
(select count(*) from ScreeningOutput where numSpotsRejected is NULL) as "ScreeningOutput.numSpotsRejected" ,
(select count(*) from ScreeningOutput where mosaicity is NULL) as "ScreeningOutput.mosaicity" ,
(select count(*) from ScreeningOutput where iOverSigma is NULL) as "ScreeningOutput.iOverSigma" ,
(select count(*) from ScreeningOutput where diffractionRings is NULL) as "ScreeningOutput.diffractionRings" ,
(select count(*) from ScreeningOutput where strategySuccess is NULL) as "ScreeningOutput.strategySuccess" ,
(select count(*) from ScreeningOutput where mosaicityEstimated is NULL) as "ScreeningOutput.mosaicityEstimated" ,
(select count(*) from ScreeningOutput where rankingResolution is NULL) as "ScreeningOutput.rankingResolution" ,
(select count(*) from ScreeningOutput where program is NULL) as "ScreeningOutput.program" ,
(select count(*) from ScreeningOutput where doseTotal is NULL) as "ScreeningOutput.doseTotal" ,
(select count(*) from ScreeningOutput where totalExposureTime is NULL) as "ScreeningOutput.totalExposureTime" ,
(select count(*) from ScreeningOutput where totalRotationRange is NULL) as "ScreeningOutput.totalRotationRange" ,
(select count(*) from ScreeningOutput where totalNumberOfImages is NULL) as "ScreeningOutput.totalNumberOfImages" ,
(select count(*) from ScreeningOutput where rFriedel is NULL) as "ScreeningOutput.rFriedel" ,
(select count(*) from ScreeningOutput where indexingSuccess is NULL) as "ScreeningOutput.indexingSuccess" ,
(select count(*) from ScreeningOutput where screeningSuccess is NULL) as "ScreeningOutput.screeningSuccess" ,
(select count(*) from ScreeningOutputLattice where screeningOutputLatticeId is NULL) as "ScreeningOutputLattice.screeningOutputLatticeId" ,
(select count(*) from ScreeningOutputLattice where screeningOutputId is NULL) as "ScreeningOutputLattice.screeningOutputId" ,
(select count(*) from ScreeningOutputLattice where spaceGroup is NULL) as "ScreeningOutputLattice.spaceGroup" ,
(select count(*) from ScreeningOutputLattice where pointGroup is NULL) as "ScreeningOutputLattice.pointGroup" ,
(select count(*) from ScreeningOutputLattice where bravaisLattice is NULL) as "ScreeningOutputLattice.bravaisLattice" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_a_x is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_x" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_a_y is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_y" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_a_z is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_a_z" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_b_x is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_x" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_b_y is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_y" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_b_z is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_b_z" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_c_x is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_x" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_c_y is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_y" ,
(select count(*) from ScreeningOutputLattice where rawOrientationMatrix_c_z is NULL) as "ScreeningOutputLattice.rawOrientationMatrix_c_z" ,
(select count(*) from ScreeningOutputLattice where unitCell_a is NULL) as "ScreeningOutputLattice.unitCell_a" ,
(select count(*) from ScreeningOutputLattice where unitCell_b is NULL) as "ScreeningOutputLattice.unitCell_b" ,
(select count(*) from ScreeningOutputLattice where unitCell_c is NULL) as "ScreeningOutputLattice.unitCell_c" ,
(select count(*) from ScreeningOutputLattice where unitCell_alpha is NULL) as "ScreeningOutputLattice.unitCell_alpha" ,
(select count(*) from ScreeningOutputLattice where unitCell_beta is NULL) as "ScreeningOutputLattice.unitCell_beta" ,
(select count(*) from ScreeningOutputLattice where unitCell_gamma is NULL) as "ScreeningOutputLattice.unitCell_gamma" ,
(select count(*) from ScreeningOutputLattice where bltimeStamp is NULL) as "ScreeningOutputLattice.bltimeStamp" ,
(select count(*) from ScreeningOutputLattice where labelitIndexing is NULL) as "ScreeningOutputLattice.labelitIndexing" ,
(select count(*) from ScreeningRank where screeningRankId is NULL) as "ScreeningRank.screeningRankId" ,
(select count(*) from ScreeningRank where screeningRankSetId is NULL) as "ScreeningRank.screeningRankSetId" ,
(select count(*) from ScreeningRank where screeningId is NULL) as "ScreeningRank.screeningId" ,
(select count(*) from ScreeningRank where rankValue is NULL) as "ScreeningRank.rankValue" ,
(select count(*) from ScreeningRank where rankInformation is NULL) as "ScreeningRank.rankInformation" ,
(select count(*) from ScreeningRankSet where screeningRankSetId is NULL) as "ScreeningRankSet.screeningRankSetId" ,
(select count(*) from ScreeningRankSet where rankEngine is NULL) as "ScreeningRankSet.rankEngine" ,
(select count(*) from ScreeningRankSet where rankingProjectFileName is NULL) as "ScreeningRankSet.rankingProjectFileName" ,
(select count(*) from ScreeningRankSet where rankingSummaryFileName is NULL) as "ScreeningRankSet.rankingSummaryFileName" ,
(select count(*) from ScreeningStrategy where screeningStrategyId is NULL) as "ScreeningStrategy.screeningStrategyId" ,
(select count(*) from ScreeningStrategy where screeningOutputId is NULL) as "ScreeningStrategy.screeningOutputId" ,
(select count(*) from ScreeningStrategy where phiStart is NULL) as "ScreeningStrategy.phiStart" ,
(select count(*) from ScreeningStrategy where phiEnd is NULL) as "ScreeningStrategy.phiEnd" ,
(select count(*) from ScreeningStrategy where rotation is NULL) as "ScreeningStrategy.rotation" ,
(select count(*) from ScreeningStrategy where exposureTime is NULL) as "ScreeningStrategy.exposureTime" ,
(select count(*) from ScreeningStrategy where resolution is NULL) as "ScreeningStrategy.resolution" ,
(select count(*) from ScreeningStrategy where completeness is NULL) as "ScreeningStrategy.completeness" ,
(select count(*) from ScreeningStrategy where multiplicity is NULL) as "ScreeningStrategy.multiplicity" ,
(select count(*) from ScreeningStrategy where anomalous is NULL) as "ScreeningStrategy.anomalous" ,
(select count(*) from ScreeningStrategy where program is NULL) as "ScreeningStrategy.program" ,
(select count(*) from ScreeningStrategy where rankingResolution is NULL) as "ScreeningStrategy.rankingResolution" ,
(select count(*) from ScreeningStrategy where transmission is NULL) as "ScreeningStrategy.transmission" ,
(select count(*) from ScreeningStrategySubWedge where screeningStrategySubWedgeId is NULL) as "ScreeningStrategySubWedge.screeningStrategySubWedgeId" ,
(select count(*) from ScreeningStrategySubWedge where screeningStrategyWedgeId is NULL) as "ScreeningStrategySubWedge.screeningStrategyWedgeId" ,
(select count(*) from ScreeningStrategySubWedge where subWedgeNumber is NULL) as "ScreeningStrategySubWedge.subWedgeNumber" ,
(select count(*) from ScreeningStrategySubWedge where rotationAxis is NULL) as "ScreeningStrategySubWedge.rotationAxis" ,
(select count(*) from ScreeningStrategySubWedge where axisStart is NULL) as "ScreeningStrategySubWedge.axisStart" ,
(select count(*) from ScreeningStrategySubWedge where axisEnd is NULL) as "ScreeningStrategySubWedge.axisEnd" ,
(select count(*) from ScreeningStrategySubWedge where exposureTime is NULL) as "ScreeningStrategySubWedge.exposureTime" ,
(select count(*) from ScreeningStrategySubWedge where transmission is NULL) as "ScreeningStrategySubWedge.transmission" ,
(select count(*) from ScreeningStrategySubWedge where oscillationRange is NULL) as "ScreeningStrategySubWedge.oscillationRange" ,
(select count(*) from ScreeningStrategySubWedge where completeness is NULL) as "ScreeningStrategySubWedge.completeness" ,
(select count(*) from ScreeningStrategySubWedge where multiplicity is NULL) as "ScreeningStrategySubWedge.multiplicity" ,
(select count(*) from ScreeningStrategySubWedge where doseTotal is NULL) as "ScreeningStrategySubWedge.doseTotal" ,
(select count(*) from ScreeningStrategySubWedge where numberOfImages is NULL) as "ScreeningStrategySubWedge.numberOfImages" ,
(select count(*) from ScreeningStrategySubWedge where comments is NULL) as "ScreeningStrategySubWedge.comments" ,
(select count(*) from ScreeningStrategySubWedge where resolution is NULL) as "ScreeningStrategySubWedge.resolution" ,
(select count(*) from ScreeningStrategyWedge where screeningStrategyWedgeId is NULL) as "ScreeningStrategyWedge.screeningStrategyWedgeId" ,
(select count(*) from ScreeningStrategyWedge where screeningStrategyId is NULL) as "ScreeningStrategyWedge.screeningStrategyId" ,
(select count(*) from ScreeningStrategyWedge where wedgeNumber is NULL) as "ScreeningStrategyWedge.wedgeNumber" ,
(select count(*) from ScreeningStrategyWedge where resolution is NULL) as "ScreeningStrategyWedge.resolution" ,
(select count(*) from ScreeningStrategyWedge where completeness is NULL) as "ScreeningStrategyWedge.completeness" ,
(select count(*) from ScreeningStrategyWedge where multiplicity is NULL) as "ScreeningStrategyWedge.multiplicity" ,
(select count(*) from ScreeningStrategyWedge where doseTotal is NULL) as "ScreeningStrategyWedge.doseTotal" ,
(select count(*) from ScreeningStrategyWedge where numberOfImages is NULL) as "ScreeningStrategyWedge.numberOfImages" ,
(select count(*) from ScreeningStrategyWedge where phi is NULL) as "ScreeningStrategyWedge.phi" ,
(select count(*) from ScreeningStrategyWedge where kappa is NULL) as "ScreeningStrategyWedge.kappa" ,
(select count(*) from ScreeningStrategyWedge where chi is NULL) as "ScreeningStrategyWedge.chi" ,
(select count(*) from ScreeningStrategyWedge where comments is NULL) as "ScreeningStrategyWedge.comments" ,
(select count(*) from ScreeningStrategyWedge where wavelength is NULL) as "ScreeningStrategyWedge.wavelength" 
from Screening limit 1;
antolinos commented 4 years ago

We propose to remove the following columns:

Screening.shortComments
Screening.xmlSampleInformation
ScreeningOutput.doseTotal
ScreeningOutput.indexingSuccess
ScreeningStrategySubWedge.doseTotal
ScreeningStrategySubWedge.resolution
ScreeningStrategyWedge.doseTotal
ScreeningStrategyWedge.chi

and the following tables:

ScreeningRank
ScreeningRankSet

Screening - Sheet1.pdf

The resulting schema is:

Screenshot from 2019-09-11 15:07:01

stufisher commented 4 years ago

let me check but i think we use: shortcomments indexingSuccess (or were planning to for status monitoring) chi

antolinos commented 4 years ago

Proposal 2.0

For a second (and more agressive) simplification, we have realized that relations between:

are really 1 to 1 and not 1 to N what means that we could eventually merge these tables

These SQL answer to the question if they are 1 to 1 or 1 to N

SELECT 
    screeningId, 
    COUNT(screeningId)
FROM
    ScreeningOutput
GROUP BY screeningId
HAVING COUNT(screeningId) > 1;

SELECT 
    screeningStrategyWedgeId, 
    COUNT(screeningStrategyWedgeId)
FROM
    ScreeningStrategyWedge
GROUP BY screeningStrategyWedgeId
HAVING COUNT(screeningStrategyWedgeId) > 1;

The result of merging Screening and ScreeningOutput is: Screenshot from 2019-09-11 15:19:29

For ScreeningStrategy and ScreeningStrategyWedge there are some columns that are repeated however we are populating the same values:

select count(*) from ScreeningStrategy
LEFT JOIN ScreeningStrategyWedge on ScreeningStrategyWedge.screeningStrategyId = ScreeningStrategy.screeningStrategyId
where 
ScreeningStrategy.resolution <> ScreeningStrategyWedge.resolution
or 
ScreeningStrategy.completeness <> ScreeningStrategyWedge.completeness
or 
ScreeningStrategy.multiplicity <> ScreeningStrategyWedge.multiplicity;

It means that could also be merged, The result can be seen at: Screenshot from 2019-09-11 15:40:02

antolinos commented 4 years ago

Proposal 3.0

And even more agressive proposal could be done by taking into account the parameters that are shown in the UI. Screening data is accesible via a database's view named v_datacollection_summary

These parameters are:

       `v_datacollection_summary_screening`.`Screening_screeningId` AS `Screening_screeningId`,
        `v_datacollection_summary_screening`.`Screening_dataCollectionId` AS `Screening_dataCollectionId`,
        `v_datacollection_summary_screening`.`Screening_dataCollectionGroupId` AS `Screening_dataCollectionGroupId`,
        `v_datacollection_summary_screening`.`ScreeningOutput_strategySuccess` AS `ScreeningOutput_strategySuccess`,
        `v_datacollection_summary_screening`.`ScreeningOutput_indexingSuccess` AS `ScreeningOutput_indexingSuccess`,
        `v_datacollection_summary_screening`.`ScreeningOutput_rankingResolution` AS `ScreeningOutput_rankingResolution`,
        `v_datacollection_summary_screening`.`ScreeningOutput_mosaicity` AS `ScreeningOutput_mosaicity`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_spaceGroup` AS `ScreeningOutputLattice_spaceGroup`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_a` AS `ScreeningOutputLattice_unitCell_a`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_b` AS `ScreeningOutputLattice_unitCell_b`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_c` AS `ScreeningOutputLattice_unitCell_c`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_alpha` AS `ScreeningOutputLattice_unitCell_alpha`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_beta` AS `ScreeningOutputLattice_unitCell_beta`,
        `v_datacollection_summary_screening`.`ScreeningOutputLattice_unitCell_gamma` AS `ScreeningOutputLattice_unitCell_gamma`,
        `v_datacollection_summary_screening`.`ScreeningOutput_totalExposureTime` AS `ScreeningOutput_totalExposureTime`,
        `v_datacollection_summary_screening`.`ScreeningOutput_totalRotationRange` AS `ScreeningOutput_totalRotationRange`,
        `v_datacollection_summary_screening`.`ScreeningOutput_totalNumberOfImages` AS `ScreeningOutput_totalNumberOfImages`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_exposureTime` AS `ScreeningStrategySubWedge_exposureTime`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_transmission` AS `ScreeningStrategySubWedge_transmission`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_oscillationRange` AS `ScreeningStrategySubWedge_oscillationRange`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_numberOfImages` AS `ScreeningStrategySubWedge_numberOfImages`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_multiplicity` AS `ScreeningStrategySubWedge_multiplicity`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_completeness` AS `ScreeningStrategySubWedge_completeness`,
        `v_datacollection_summary_screening`.`ScreeningStrategySubWedge_axisStart` AS `ScreeningStrategySubWedge_axisStart`,

By removing the unused values the shema is: Screenshot from 2019-09-11 16:31:46

Then wedge and subwedge can be merged: Screenshot from 2019-09-11 16:33:57

antolinos commented 4 years ago

This is at the end the information we showed in the UI like:

download

stufisher commented 4 years ago

We show more info than this, and some programs generate multi-wedge/multi-subwedge strategies. Do you generate things like this at esrf, and how does exi display this info?

stufisher commented 4 years ago

The image looks like a full data collection output, rather than a screening output?

stufisher commented 4 years ago

This needs tabbing as we have a lot of info these days, but you get the idea: Screen Shot 2019-09-11 at 16 44 25

stufisher commented 4 years ago

I think we could do with graeme and olofs input on this. We certainly generate multiwedge, and possibly multi subwedge strategies, but i need to understand a bit more what their relevance is. Which means you can't merge wedge / subwedge

antolinos commented 4 years ago

All this rationale is based on the information stored and displayed on ISPyB. In theory, except if I am missing something, we could remove all these columns and tables and nobody will realize the change...

graeme-winter commented 4 years ago

Tuning in from Diamond at the request of @stufisher -

To unpack:

We do not typically use the subwedge feature of BEST

KarlLevik commented 4 years ago

Data from Diamond - output from @antolinos second query:

                            Screening.screeningId: 0
                      Screening.diffractionPlanId: 3268241
                  Screening.dataCollectionGroupId: 493670
                       Screening.dataCollectionId: 64
                            Screening.bltimeStamp: 888158
                         Screening.programVersion: 1823
                               Screening.comments: 1171527
                          Screening.shortComments: 62449
                   Screening.xmlSampleInformation: 3268241
                ScreeningOutput.screeningOutputId: 0
                      ScreeningOutput.screeningId: 0
                ScreeningOutput.statusDescription: 1071738
              ScreeningOutput.rejectedReflections: 2872832
               ScreeningOutput.resolutionObtained: 2870560
                   ScreeningOutput.spotDeviationR: 1823998
               ScreeningOutput.spotDeviationTheta: 2184137
                       ScreeningOutput.beamShiftX: 1823998
                       ScreeningOutput.beamShiftY: 1823998
                    ScreeningOutput.numSpotsFound: 1823998
                     ScreeningOutput.numSpotsUsed: 1823998
                 ScreeningOutput.numSpotsRejected: 1823998
                        ScreeningOutput.mosaicity: 948650
                       ScreeningOutput.iOverSigma: 2872832
                 ScreeningOutput.diffractionRings: 2870547
                  ScreeningOutput.strategySuccess: 0
               ScreeningOutput.mosaicityEstimated: 0
                ScreeningOutput.rankingResolution: 2875193
                          ScreeningOutput.program: 2836554
                        ScreeningOutput.doseTotal: 2875220
                ScreeningOutput.totalExposureTime: 2875207
               ScreeningOutput.totalRotationRange: 2875207
              ScreeningOutput.totalNumberOfImages: 2875207
                         ScreeningOutput.rFriedel: 2875220
                  ScreeningOutput.indexingSuccess: 0
                 ScreeningOutput.screeningSuccess: 0
  ScreeningOutputLattice.screeningOutputLatticeId: 0
         ScreeningOutputLattice.screeningOutputId: 0
                ScreeningOutputLattice.spaceGroup: 0
                ScreeningOutputLattice.pointGroup: 1958282
            ScreeningOutputLattice.bravaisLattice: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_a_x: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_a_y: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_a_z: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_b_x: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_b_y: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_b_z: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_c_x: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_c_y: 1958282
  ScreeningOutputLattice.rawOrientationMatrix_c_z: 1958282
                ScreeningOutputLattice.unitCell_a: 0
                ScreeningOutputLattice.unitCell_b: 0
                ScreeningOutputLattice.unitCell_c: 0
            ScreeningOutputLattice.unitCell_alpha: 0
             ScreeningOutputLattice.unitCell_beta: 0
            ScreeningOutputLattice.unitCell_gamma: 0
               ScreeningOutputLattice.bltimeStamp: 1161433
           ScreeningOutputLattice.labelitIndexing: 36394
                    ScreeningRank.screeningRankId: 0
                 ScreeningRank.screeningRankSetId: 0
                        ScreeningRank.screeningId: 0
                          ScreeningRank.rankValue: 0
                    ScreeningRank.rankInformation: 0
              ScreeningRankSet.screeningRankSetId: 0
                      ScreeningRankSet.rankEngine: 1
          ScreeningRankSet.rankingProjectFileName: 1
          ScreeningRankSet.rankingSummaryFileName: 1
            ScreeningStrategy.screeningStrategyId: 0
              ScreeningStrategy.screeningOutputId: 0
                       ScreeningStrategy.phiStart: 2862206
                         ScreeningStrategy.phiEnd: 2862206
                       ScreeningStrategy.rotation: 2862234
                   ScreeningStrategy.exposureTime: 1072190
                     ScreeningStrategy.resolution: 2862234
                   ScreeningStrategy.completeness: 2864481
                   ScreeningStrategy.multiplicity: 2864481
                      ScreeningStrategy.anomalous: 0
                        ScreeningStrategy.program: 158244
              ScreeningStrategy.rankingResolution: 1825286
                   ScreeningStrategy.transmission: 2864446

ScreeningStrategySubWedge.screeningStrategySubWedgeId: 0 ScreeningStrategySubWedge.screeningStrategyWedgeId: 4 ScreeningStrategySubWedge.subWedgeNumber: 875326 ScreeningStrategySubWedge.rotationAxis: 13 ScreeningStrategySubWedge.axisStart: 18 ScreeningStrategySubWedge.axisEnd: 18 ScreeningStrategySubWedge.exposureTime: 20975 ScreeningStrategySubWedge.transmission: 738181 ScreeningStrategySubWedge.oscillationRange: 8 ScreeningStrategySubWedge.completeness: 28384 ScreeningStrategySubWedge.multiplicity: 924671 ScreeningStrategySubWedge.doseTotal: 1754587 ScreeningStrategySubWedge.numberOfImages: 3513 ScreeningStrategySubWedge.comments: 1771745 ScreeningStrategySubWedge.resolution: 21 ScreeningStrategyWedge.screeningStrategyWedgeId: 0 ScreeningStrategyWedge.screeningStrategyId: 0 ScreeningStrategyWedge.wedgeNumber: 157727 ScreeningStrategyWedge.resolution: 196995 ScreeningStrategyWedge.completeness: 197057 ScreeningStrategyWedge.multiplicity: 1093346 ScreeningStrategyWedge.doseTotal: 1096864 ScreeningStrategyWedge.numberOfImages: 200487 ScreeningStrategyWedge.phi: 1813020 ScreeningStrategyWedge.kappa: 1902549 ScreeningStrategyWedge.chi: 1892181 ScreeningStrategyWedge.comments: 1921620 ScreeningStrategyWedge.wavelength: 1981727

KarlLevik commented 4 years ago

The 3rd queries which is counting ScreeningOutputs pointing to the same Screening parent row is empty. I.e we don't have multiple ScreeningOutputs for the same Screening. I.e. this appears to be a 1-to-1 relationship.

I think the 4th query isn't quite right - I think it should be:

SELECT screeningStrategyId,      
COUNT(screeningStrategyId) 
FROM  ScreeningStrategyWedge 
GROUP BY screeningStrategyId 
HAVING COUNT(screeningStrategyId) > 1;

This gives a lot of results for us - i.e. we have many examples of multiple wedges per strategy in the Diamond database. I.e. this is a real many-to-one relationship.

antolinos commented 4 years ago

It is decided to double check if this schema is possible: 64706429-bb688580-d4b1-11e9-9f1f-20463bab7a83

Olof proposes to change the name of the tables screening by Strategy

stufisher commented 4 years ago

would like to add autoprocprogram as the precursor, with fk from strategy to autoprocprogram, and remove autoprocprogram columns from strategy (programversion, bltimestamp, etc)