SciCatProject / scicat-backend-next

SciCat Data Catalogue Backend
https://scicatproject.github.io/documentation/
BSD 3-Clause "New" or "Revised" License
18 stars 21 forks source link

Scientific Metadata with / without '.value' (or ' .v') #138

Open linupi opened 1 year ago

linupi commented 1 year ago

Scientific Metadata with / without '.value

Summary

when working on https://github.com/SciCatProject/scicat-backend-next/pull/119 we saw that for the new SciCat backend there is currently no support for metadata outside the '.value', '.unit' scheme. In https://github.com/SciCatProject/scicat-backend-next/issues/119#issuecomment-1201236240_ there is an example of scientific metadata from PSI that falls in this category:

image

Originally posted by @stephan271 in https://github.com/SciCatProject/scicat-backend-next/issues/119#issuecomment-1201236240

Expected Behaviour

like in the old backend the frontend search should work also on any string and integer values in the metadata without having a 'unit' / 'value'. Further in the example above 'v' and 'u' instead of 'value' and 'unit'.

I have the feeling it would make sense to provide a sort of generic helper to deal with the different options of having .value, .value+.unit or none of them. Currently '.value' is hardcoded in the new backend code e.g.

src/common/utils.ts:20:    return { valueSI: Number(quantity.value), unitSI: quantity.unit };
src/common/utils.ts:92:    const matchKeyMeasurement = `scientificMetadata.${lhs}.valueSI`;
src/common/utils.ts:97:        scientificFilterQuery[`${matchKeyGeneric}.value`] = { $eq: rhs };
src/common/utils.ts:102:        scientificFilterQuery[`${matchKeyGeneric}.value`] = { $eq: escapedRhs };
src/common/utils.ts:111:          scientificFilterQuery[`${matchKeyGeneric}.value`] = { $eq: rhs };
src/common/utils.ts:121:          scientificFilterQuery[`${matchKeyGeneric}.value`] = { $gt: rhs };
src/common/utils.ts:131:          scientificFilterQuery[`${matchKeyGeneric}.value`] = { $lt: rhs };
src/datasets/interceptors/fullquery.interceptor.ts:34:                `${lhs}.value`,
src/datasets/interceptors/fullquery.interceptor.ts:46:                  `${lhs}.value`,
nitrosx commented 1 year ago

@linupi : I agree with your suggestion about helper functions when retrieving, updating and query metadata. Personally I would allow the following 2 cases:

I would also provide a script/function to manage the transition from current status to the one above, so we transform the v and u keys to value and unit respectively

bpedersen2 commented 1 year ago

It looks like 'string' metadata use the second form ( value + empty unit) when creating them from the frontend.

bpedersen2 commented 10 months ago

ping...

rkweehinzmann commented 1 month ago

Going through this issue in more detail it seems to me we do not need any generalised helper to deal with different cases as suggested by @linupi back in 2022, but rather implement value + optional unit as mentioned here by @nitrosx : property : { 'value': value, 'unit': '' } This should solve #119, ie. the issue of finding meta data by inputting only part of a string without any unit, right?

linupi commented 1 month ago

well back in 2022 I just described the discrepancy of what existed in the old backend compared to the new one. The old one generally worked with property: 'value' as well as property:{value:'value', unit:''} for the exact matching of strings so it was just natural to apply the same logic also for the partial matching. In the end it is a design choice if property: 'value' would be acceptable or not - and i could probably survive with both options. To me it still makes sense to have such an option since property:{value:'my string', unit:'an_existing_unit'} would need to be carefully filtered and forbidden I guess. Further, if you look at what @stephan271 posted in https://github.com/SciCatProject/scicat-backend-next/issues/119#issuecomment-1201236240_ an enforcement of property:{value:'value', unit:''} would break with the current practice

image

so in conclusion I strongly tend to suggest to not enforce unit value pairs, as this is not always a correct representation. Looking at the ROI in the last rows of above's screenshot one could even debate if units should even be enforced on numeric values as well.

nitrosx commented 1 week ago

I think we should refactor the code to accept with and without the value field to make it backward compatible with alder entries. At ESS, we are going to use the value and unit fields.

nitrosx commented 1 week ago

I propose that SciCat can correctly work and represent the following cases:

I will leave it up to each facilities if they want to enforce any of the above formats and which ones