RESQUE-Framework / website

The Research Quality Evaluation Scheme
https://resque-framework.github.io/website/
MIT License

Meaningful Ids for indicators #32

Closed alpkaanaksu closed 9 months ago

alpkaanaksu commented 10 months ago

Criteria:

| Old | New |
| --- | --- |
| P3 | Title |
| P4 | Year |
| P5 | DOI |
| p_top_paper | P_TopPaper |
| P6 | P_TypeMethod |
| P6_other | P_TypeMethod_Other |
| P7 | P_TypePublication |
| P7_other | P_TypePublication_Other |
| P8 | P_Suitable |
| P8_explanation | P_Suitable_OptOut |
| P9_info | P_CRediT_Info |
| P9a | P_CRediT_Conceptualization |
| P9n | P_CRediT_WritingReviewEditing |
| P10 | P_Data |
| P11 | P_Data_Open |
| P11_explanation | P_Data_Open_NotApplicable |
| P11_extra | P_Data_Open_Identifier |
| P12 | P_Data_Open_AccessLevel |
| P12_explanation | P_Data_Open_ZK2Explanation |
| P13 | P_Data_Open_FAIR |
| P14 | P_IndependentVerification |
| P14_explanation | P_IndependentVerification_NotApplicable |
| P14_extra | P_IndependentVerification_Identifier |
| P15 | P_ReproducibleScripts |
| P15_explanation | P_ReproducibleScripts_NotApplicable |
| P15_extra | P_ReproducibleScripts_Identifier |
| P16 | P_ReproducibleScripts_FAIR |
| P17 | P_OpenMaterials |
| P17_explanation | P_OpenMaterials_NotApplicable |
| P17_extra | P_OpenMaterials_Identifier |
| P18 | P_Preregistration |
| P18_explanation | P_Preregistration_NotApplicable |
| P18_extra | P_Preregistration_Identifier |
| P19 | P_Preregistration_Content |
| P20 | P_FormalModeling |
| P20_explanation | P_FormalModeling_NotApplicable |
| P21 | P_PreregisteredReplication |
| P21_explanation | P_PreregisteredReplication_NotApplicable |
| P21_extra | P_PreregisteredReplication_Identifier |
| P22 | P_PowerConsiderations |
| P23 | P_OpenScienceBadges |
| P24 | P_SampleSize |
| P25 | P_Merit |

alpkaanaksu commented 10 months ago

Maybe use 'NA' instead of 'NotApplicable' and 'Id' instead of 'Identifier'? I am not sure if we need/want those abbreviations. (I am not a fan of abbreviations in general, I usually try to avoid them, especially when writing texts. But it might be okay to have them in ids)

ChatGPT says 'Identifier' is a common name for URL and DOI :)

The common name for both URL and DOI in the context of digital resources is "identifier". Both of these are unique identifiers used to locate and access specific resources on the internet.

nicebread commented 10 months ago

"NA" is ambiguous, as it means both "not available" (which implies 0 points and no reduction of the max points) and "not applicable".

I changed all "NotApplicable" to "NAExplanation" (violating my own reasoning above ...).
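
To make the distinction between "not available" and "not applicable" concrete, here is a minimal sketch. It assumes that a not-applicable indicator is removed from the maximum attainable points, while a not-available indicator scores 0 and keeps its maximum. The `Indicator` class, the status strings, and `relative_score` are hypothetical names for illustration, not the actual RESQUE scoring code.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    points: float      # points awarded for this indicator
    max_points: float  # maximum points this indicator can contribute
    status: str        # assumed values: "scored", "not_available", "not_applicable"


def relative_score(indicators):
    """Return achieved points divided by attainable points."""
    achieved = 0.0
    attainable = 0.0
    for ind in indicators:
        if ind.status == "not_applicable":
            # Not applicable: the indicator drops out of the maximum entirely.
            continue
        # Not available: counts toward the maximum but earns 0 points.
        attainable += ind.max_points
        if ind.status == "scored":
            achieved += ind.points
    return achieved / attainable if attainable else 0.0


indicators = [
    Indicator(1, 1, "scored"),
    Indicator(0, 1, "not_available"),
    Indicator(0, 1, "not_applicable"),
]
print(relative_score(indicators))  # 1 of 2 attainable points
```

Under these assumptions, reading the third indicator as "not available" instead would lower the result from 1/2 to 1/3, which is why a bare "NA" in an id is risky.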

"ID" is fine.

Minimal changes:

| Old | New |
| --- | --- |
| P3 | Title |
| P4 | Year |
| P5 | DOI |
| p_top_paper | P_TopPaper |
| P6 | P_TypeMethod |
| P6_other | P_TypeMethod_Other |
| P7 | P_TypePublication |
| P7_other | P_TypePublication_Other |
| P8 | P_Suitable |
| P8_explanation | P_Suitable_Explanation |
| P9_info | P_CRediT_Info |
| P9a | P_CRediT_Conceptualization |
| P9n | P_CRediT_WritingReviewEditing |
| P10 | P_Data |
| P11 | P_Data_Open |
| P11_explanation | P_Data_Open_NAExplanation |
| P11_extra | P_Data_Open_Identifier |
| P12 | P_Data_Open_AccessLevel |
| P12_explanation | P_Data_Open_ZK2Explanation |
| P13 | P_Data_Open_FAIR |
| P14 | P_IndependentVerification |
| P14_explanation | P_IndependentVerification_NAExplanation |
| P14_extra | P_IndependentVerification_Identifier |
| P15 | P_ReproducibleScripts |
| P15_explanation | P_ReproducibleScripts_NAExplanation |
| P15_extra | P_ReproducibleScripts_Identifier |
| P16 | P_ReproducibleScripts_FAIR |
| P17 | P_OpenMaterials |
| P17_explanation | P_OpenMaterials_NAExplanation |
| P17_extra | P_OpenMaterials_Identifier |
| P18 | P_Preregistration |
| P18_explanation | P_Preregistration_NAExplanation |
| P18_extra | P_Preregistration_Identifier |
| P19 | P_Preregistration_Content |
| P20 | P_FormalModeling |
| P20_explanation | P_FormalModeling_NAExplanation |
| P21 | P_PreregisteredReplication |
| P21_explanation | P_PreregisteredReplication_NAExplanation |
| P21_extra | P_PreregisteredReplication_Identifier |
| P22 | P_PowerConsiderations |
| P23 | P_OpenScienceBadges |
| P24 | P_SampleSize |
| P25 | P_Merit |

nicebread commented 10 months ago

Just a thought (if you haven't implemented them yet): Maybe add "has" and "is" to appropriate indicators? E.g. P10 / P_Data could be `P_has_Data`.

alpkaanaksu commented 10 months ago

This is how I think about it: with '_', you go into a kind of subfield. 'P' is the root for publication indicators. 'P_Data' is the parent node for all other data-related items and the entry point to the data subfield, 'P_Data_Open' is the parent node for all open-data items, and so on.
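
The hierarchical reading of '_' can be sketched in a few lines: splitting each id on '_' gives a path from the root 'P' down to the indicator. This is only an illustration with a handful of ids from the table; `build_tree` and `descendants` are hypothetical helper names, not functions from the RESQUE code base.

```python
def build_tree(ids):
    """Nest ids into dictionaries, one level per '_'-separated part."""
    tree = {}
    for indicator_id in ids:
        node = tree
        for part in indicator_id.split("_"):
            node = node.setdefault(part, {})
    return tree


def descendants(ids, parent):
    """Return all ids that live below the given parent node."""
    prefix = parent + "_"
    return [i for i in ids if i.startswith(prefix)]


ids = [
    "P_Data",
    "P_Data_Open",
    "P_Data_Open_AccessLevel",
    "P_Data_Open_FAIR",
    "P_Suitable",
]

tree = build_tree(ids)
print(tree)
print(descendants(ids, "P_Data"))
```

With this structure, everything related to open data can be selected by the prefix 'P_Data_Open_' alone, which is the kind of information a flat, readable id does not carry.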

Adding '_has'/'_is' implies two new main categories of indicators, under which 'P_is_TypePublication' and 'P_is_Suitable' would somehow be related. They are both 'is' attributes, which is a similarity, but I don't know whether it is a meaningful one. I personally don't see any reason to give 'P_TypePublication' and 'P_Suitable' a common parent node.

Do you have some use cases for this in mind?

nicebread commented 10 months ago

OK, I understand your hierarchical logic. My idea was that the semantics are more intuitive ("P_has_open_data" (1/0) is directly understandable). But that probably implies another structure.

alpkaanaksu commented 10 months ago

We could create ids of that kind: 'p_has_open_data' (you can literally read it as 'publication has open data'), 'p_is_preregistered_replication', 'p_open_data_access_level'. We lose the hierarchical structure, but the ids are easier to read.

Hierarchical ids give us more information about the indicators; maybe this is more important than readability? Which one is more important to you?

nicebread commented 10 months ago

I think I'd prefer the readable style. Could you add a third column to the table where we compare them? (Not necessarily all indicators, just for the first 15 or so to get an idea).

alpkaanaksu commented 10 months ago

Since '_' has no special meaning in ids with no hierarchical information, we can just use normal snake_case instead of the weird mix we had.

| Old | New (hierarchical) | New (readable) |
| --- | --- | --- |
| P3 | Title | title |
| P4 | Year | year |
| P5 | DOI | doi |
| p_top_paper | P_TopPaper | p_is_top_paper |
| P6 | P_TypeMethod | p_method |
| P6_other | P_TypeMethod_Other | p_method_other |
| P7 | P_TypePublication | p_type |
| P7_other | P_TypePublication_Other | p_type_other |
| P8 | P_Suitable | p_is_suitable |
| P8_explanation | P_Suitable_Explanation | p_is_suitable_explanation |
| P9_info | P_CRediT_Info | p_credit_info |
| P9a | P_CRediT_Conceptualization | p_credit_conceptualization |
| ... | ... | ... |
| P10 | P_Data | p_has_data |
| P11 | P_Data_Open | p_has_open_data |
| P11_explanation | P_Data_Open_NAExplanation | p_has_open_data_na_explanation |
| P11_extra | P_Data_Open_Identifier | p_open_data_identifier |
| P12 | P_Data_Open_AccessLevel | p_open_data_access_level |
| P12_explanation | P_Data_Open_AccessLevel_ZK2Explanation | p_open_data_access_level_zk2_explanation |
| P13 | P_Data_Open_FAIR | p_is_open_data_fair |
| P14 | P_IndependentVerification | p_has_independent_verification |

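
Whether an id follows plain snake_case (as in the readable column) rather than the mixed style can be checked mechanically. The following is a small sketch using a regular expression; the pattern and the function name `is_snake_case` are assumptions for illustration, not part of the project.

```python
import re

# Lowercase letters and digits, with parts separated by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(indicator_id):
    """Return True if the whole id matches the snake_case pattern."""
    return SNAKE_CASE.fullmatch(indicator_id) is not None


for candidate in ["p_has_open_data", "P_TopPaper", "p_open_data_access_level_zk2_explanation"]:
    print(candidate, is_snake_case(candidate))
```
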
nicebread commented 10 months ago

After some discussion, we decided to stay with the "hierarchical" style (it has some practical advantages, although at the cost of being slightly less intuitive).

alpkaanaksu commented 10 months ago

Replaced all ids in the indicator definitions and scoring (commit 3febc979dda55ae5f0c76ec160f51dee1fff0e18).

We should test everything before closing this issue. This change can break a lot of things.
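
One way to test a rename like this is to apply the old-to-new mapping to stored data and then check that no old id survives. The sketch below uses a few rows from the "minimal changes" table; the record layout and the helper names `rename_keys` and `leftover_old_ids` are hypothetical and may not match how the website actually stores indicator values.

```python
# Subset of the old -> new mapping, taken from the table in this thread.
RENAMES = {
    "P10": "P_Data",
    "P11": "P_Data_Open",
    "P11_explanation": "P_Data_Open_NAExplanation",
    "P11_extra": "P_Data_Open_Identifier",
}


def rename_keys(record, renames):
    """Return a copy of the record with old ids replaced by new ones."""
    return {renames.get(key, key): value for key, value in record.items()}


def leftover_old_ids(record, renames):
    """List any old ids that are still used as keys in the record."""
    return sorted(key for key in record if key in renames)


old_record = {"P10": "yes", "P11": "partially", "P11_extra": "https://example.org/data"}
new_record = rename_keys(old_record, RENAMES)

print(new_record)
print(leftover_old_ids(new_record, RENAMES))  # an empty list means no old ids remain
```

Running the same leftover check over the indicator definitions and the scoring code would be a quick first pass before manual testing.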

nicebread commented 10 months ago

My tests found no bug so far ...

alpkaanaksu commented 9 months ago

I think it is safe to assume that there are no bugs related to the new Ids. I think we would have found them by now.