FAIRplus / FAIR_wizard

https://www.ebi.ac.uk/ait/fair-wizard/
Apache License 2.0

Idea: improve the resource selection process by calculating relevance scores #23

Open FuqiX opened 3 years ago

FuqiX commented 3 years ago

Problem: The wizard returns too many resources.

Current behaviour (as intended):

Answers: label_A, label_B, label_C

In the database:

Resource1: {labels: label_A, label_B} {relatesto: N/A}

Resource2: {labels: label_B} {relatesto: Resource3}

Resource3: {labels: N/A}

Resource4: {relatesto: Resource3}

All four resources will be returned: R1 and R2 have matching labels, R3 is linked to R2, and R4 is linked to R3, even though R4 might not be relevant.

One possible solution is to calculate a relevance score for each resource: the number of matched labels plus the number of related resources.

For example, a resource is selected only if it has at least two pieces of evidence: >=2 matching labels, or one matching label + 1 relationshipInfo, or 2 relationshipInfo.

Also, there should be a cap on how many levels of relationships are followed. For example, R4 is not selected because it doesn't have a direct relationship with R2.
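A rough Python sketch of the scoring idea, to make the proposal concrete. This is not the wizard's actual code: the `RESOURCES` dictionary, the function names, and the `min_score` / `max_depth` parameters are all made up for illustration. It also assumes that `relatesto` links can be followed in both directions and that a relationship only counts as evidence when it leads to a label-matched resource; both points still need to be decided.

```python
from collections import deque

# Hypothetical in-memory stand-in for the database in the example above.
RESOURCES = {
    "Resource1": {"labels": {"label_A", "label_B"}, "relates_to": set()},
    "Resource2": {"labels": {"label_B"}, "relates_to": {"Resource3"}},
    "Resource3": {"labels": set(), "relates_to": set()},
    "Resource4": {"labels": set(), "relates_to": {"Resource3"}},
}
ANSWERS = {"label_A", "label_B", "label_C"}


def build_neighbours(resources):
    """Build an undirected adjacency map from relates_to (an assumption)."""
    neighbours = {name: set() for name in resources}
    for name, info in resources.items():
        for target in info["relates_to"]:
            if target in resources:  # ignore links to unknown resources
                neighbours[name].add(target)
                neighbours[target].add(name)
    return neighbours


def relevance_scores(resources, answers, max_depth=1):
    """Score = matched labels + label-matched resources within max_depth hops."""
    neighbours = build_neighbours(resources)
    label_hits = {
        name: len(info["labels"] & answers) for name, info in resources.items()
    }
    scores = {}
    for name in resources:
        seen = {name}
        queue = deque([(name, 0)])
        related = 0
        while queue:
            current, depth = queue.popleft()
            if depth == max_depth:  # the cap on relationship levels
                continue
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
                    if label_hits[nxt] > 0:
                        related += 1
        scores[name] = label_hits[name] + related
    return scores


def select(resources, answers, min_score=2, max_depth=1):
    """Return the names of resources whose score reaches min_score."""
    scores = relevance_scores(resources, answers, max_depth)
    return sorted(name for name, score in scores.items() if score >= min_score)


print(relevance_scores(RESOURCES, ANSWERS))
print(select(RESOURCES, ANSWERS, min_score=2))
print(select(RESOURCES, ANSWERS, min_score=1, max_depth=1))
```

Under these assumptions the example scores come out as R1 = 2, R2 = 1, R3 = 1, R4 = 0. A cut-off of 2 keeps only R1, while a cut-off of 1 with `max_depth=1` keeps R1, R2 and R3 and drops R4. Raising `max_depth` to 2 lets R4 back in through R3, which is the case the cap is meant to prevent.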

theisuru commented 3 years ago

Currently we only return label-matched resources. We will include first-degree relationships next. We have to come up with scoring criteria plus a cut-off mark, or exact inclusion rules (we can start with >=2 matching labels, or one matching label + 1 relationshipInfo, or 2 relationshipInfo).

FuqiX commented 3 years ago

This comes after #21.