Open pstoehr opened 9 years ago
The rationale behind the result group is to group similar items, e.g. multiple, very similar, but slightly different views of the same ancient coin. And we need a representative for the result group as well. With this context in mind, the above format is also not completely clear to me. As far as I know, no other client currently uses this information, which might explain why nobody else has stumbled across this yet.
For the format discussion, I will assign this to Thomas from the PP group, who may pass it on directly to the recommender.
With this piece of information in mind ... Shouldn't a resultGroup consist of at least two entries?
The result group holds other results that were considered near duplicates of the result already added to the result list. Therefore there might be 1 to n documents in the group. It doesn't hold only DocumentBadges; it holds a list of complete results. The result groups within the already grouped documents will always be empty. I will have a look at whether we can remove the group within the grouped documents to avoid further confusion about it.
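Until the nested group is removed on the server side, a client could drop it itself. The following is only a sketch of that idea, not part of the PP; the function name `strip_nested_groups` is hypothetical, and only the `resultGroup` field name is taken from the response format discussed here.

```python
def strip_nested_groups(result):
    """Return a copy of `result` whose grouped entries have no resultGroup key.

    Assumption: grouping is one level deep, so the nested resultGroup of a
    grouped entry is always empty and can be dropped without losing data.
    """
    cleaned = dict(result)  # shallow copy, the input is left untouched
    cleaned["resultGroup"] = [
        {key: value for key, value in member.items() if key != "resultGroup"}
        for member in result.get("resultGroup", [])
    ]
    return cleaned
```

The top-level `resultGroup` is kept, so views that want the near duplicates can still read them.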
So, let me see if I understood this correctly. For example, I get a list of 4 results: res1, res2, res3, res4. Let's say res2 additionally has very similar results res2a, res2b. Then the latter would be in the result group added to res2? And res1, res3, res4 would not have a result group added (or would have an empty one)?
Yes, the basic idea is just to remove duplicates and near duplicates from the list. But they might still be useful information for some views (e.g. showing only images), so we didn't want to remove those items and instead attach them to the result which we considered similar or equal.
and in the above example, the numberOfResults would be 4 or 6?
numberOfResults should be 4 since the groups are not counted
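The example above can be sketched in a few lines. This is an illustration under assumptions, not the PP's code: the helpers `make_result`, `count_results` and `flatten` are hypothetical, and the ids `res1` ... `res4` are the placeholder names from this thread.

```python
def make_result(doc_id, group=None):
    """Build a minimal result with a documentBadge and a resultGroup."""
    return {"documentBadge": {"id": doc_id}, "resultGroup": group or []}

# res2 carries its two near duplicates; the other results have empty groups.
results = [
    make_result("res1"),
    make_result("res2", [make_result("res2a"), make_result("res2b")]),
    make_result("res3"),
    make_result("res4"),
]

def count_results(results):
    """Count top-level results only; grouped near duplicates are not counted."""
    return len(results)

def flatten(results):
    """List every document id, each result followed by its grouped entries."""
    ids = []
    for result in results:
        ids.append(result["documentBadge"]["id"])
        ids.extend(m["documentBadge"]["id"] for m in result.get("resultGroup", []))
    return ids
```

Here `count_results(results)` gives 4, while `flatten(results)` lists all 6 documents, which is what an image-only view might want.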
Thanks for the clarification!
But the PP returned the following entry:
{
  "resultGroup": [
    {
      "resultGroup": [],
      "documentBadge": {
        "id": "\/2022343\/D0807F3F4D94D1529BDAA3D13DFB462E64B7B6CF",
        "uri": "http:\/\/europeana.eu\/resolve\/record\/2022343\/D0807F3F4D94D1529BDAA3D13DFB462E64B7B6CF",
        "provider": "Europeana"
      },
      "mediaType": "IMAGE",
      "previewImage": "http:\/\/europeanastatic.eu\/api\/image?uri=http%3A%2F%2Fwww.culturegrid.org.uk%2Fdpp%2Fresource%2F3075231%2Fstream%2Fthumbnail_image_jpeg&size=LARGE&type=IMAGE",
      "title": "Netsuke in form of women giving birth",
      "date": "1900-01-01",
      "language": "en",
      "licence": "http:\/\/www.europeana.eu\/rights\/rr-f\/",
      "generatingQuery": "(giving AND birth)"
    }
  ],
  "documentBadge": {
    "id": "\/2022343\/D0807F3F4D94D1529BDAA3D13DFB462E64B7B6CF",
    "uri": "http:\/\/europeana.eu\/resolve\/record\/2022343\/D0807F3F4D94D1529BDAA3D13DFB462E64B7B6CF",
    "provider": "Europeana"
  },
  "mediaType": "IMAGE",
  "previewImage": "http:\/\/europeanastatic.eu\/api\/image?uri=http%3A%2F%2Fwww.culturegrid.org.uk%2Fdpp%2Fresource%2F3075231%2Fstream%2Fthumbnail_image_jpeg&size=LARGE&type=IMAGE",
  "title": "Netsuke in form of women giving birth",
  "date": "1701-01-01",
  "language": "en",
  "licence": "http:\/\/www.europeana.eu\/rights\/rr-f\/",
  "generatingQuery": "(giving AND birth)"
},
Therefore I have two additional questions: 1) Why is there an empty resultGroup as a member of the resultGroup? Is that the one that can be ignored? 2) Why is the same document referenced twice?
1.) That was what I meant with "I will have a look if we can remove the group within the grouped documents to avoid further confusion about it.". Both objects use the same representation within the system, so they are generated in the same way. Therefore there is a result group within the already grouped objects, but it will always be empty.
2.) It looks like Europeana somehow returned the same document twice. If you have a look at the date attribute of both results you can see that, although the "uri" and "id" are identical, the date varies. So either that's a problem in the index of Europeana or it's a problem of the transformation. We should ask @jr-dig-orgel about that.
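A check like the one described in 2.) can be automated. The sketch below is an assumption about how a client might spot such cases, not something the PP does; the function name `duplicate_members` is hypothetical, and it only relies on the `documentBadge`, `id` and `resultGroup` fields shown in the entry above.

```python
def duplicate_members(result):
    """Yield (member, differing_fields) for grouped entries sharing the parent's id.

    `differing_fields` names the top-level attributes (e.g. "date") whose
    values differ between the parent and the grouped entry.
    """
    parent_id = result["documentBadge"]["id"]
    for member in result.get("resultGroup", []):
        if member["documentBadge"]["id"] != parent_id:
            continue  # a genuine near duplicate, i.e. a different document
        keys = (set(result) | set(member)) - {"resultGroup", "documentBadge"}
        differing = sorted(k for k in keys if result.get(k) != member.get(k))
        yield member, differing
```

For the entry quoted above, the grouped document has the same id as its parent and the only differing field would be `date`.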
Thanks for the clarifications!
To me it seems that there is a problem in the index of Europeana; the transformation seems to work correctly.
Interestingly, the date in the original Europeana entry is a range: 1700-1900 http://www.europeana.eu/portal/record/2022343/D0807F3F4D94D1529BDAA3D13DFB462E64B7B6CF.html?start=1&query=Netsuke+in+form+of+women+giving+birth&startPage=1&qt=false&rows=24
Thomas, could it be that the transformation creates two entries for records with a date range, one with the start date and one with the end date?
@chseifert thanks for the hint -> I will check this
Based on the documentation we assumed that a resultGroup entry is either empty or it consists of documentBadge entries. But the PP also returns results where a resultGroup entry consists of an empty resultGroup and an additional documentBadge entry.
Is a resultGroup defined as a list/array of documentBadges, or can it additionally contain resultGroups recursively?
Endpoint: https://eexcess-dev.joanneum.at/eexcess-privacy-proxy-issuer-1.0-SNAPSHOT/issuer/recommend
Example:
{
  "provider": "federated",
  "totalResults": 10,
  "partnerResponseState": [
    { "systemID": "Deutsche Digitale Bibliothek", "success": true },
    { "systemID": "Europeana", "success": true },
    { "systemID": "Kierling", "success": true },
    { "systemID": "Mendeley", "success": true },
    { "systemID": "KIMPortal", "success": true },
    { "systemID": "Wissenmedia", "success": true },
    { "systemID": "ZBW", "success": true }
  ],
  "queryID": "1892381444",
  "result": [
    {
      "resultGroup": []
    }