Open saeid-p opened 8 years ago
To expand sorting to Author, Date Published, Repository, and Title, we need these fields in the mapping of every repository in ElasticSearch. Currently, some of these fields are missing in some repositories, and in some cases (Date Published) the field is not named consistently. For instance, the publication date is stored under different names in different repositories. The following are a few instances:
Generally, to use ElasticSearch's sorting functionality, each of these fields needs a consistent name and type in the index mapping. If a repository doesn't provide the field, it can be stored without a value, but the mapping of the field should exist in all repositories.
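To make the idea concrete, here is a minimal sketch of a shared mapping fragment and a sort request, written as plain Python dictionaries. The field names (`author`, `datePublished`, `repository`) and the helper functions are hypothetical, not the project's actual schema. The `keyword` type assumes a recent ElasticSearch release; on older releases the equivalent would be a non-analyzed `string` field.

```python
# Hypothetical canonical sort fields; the names are illustrative only.
SORT_FIELDS = {
    "title": "keyword",
    "author": "keyword",
    "datePublished": "date",
    "repository": "keyword",
}


def build_sort_mapping():
    """Return the mapping fragment that every repository index would share."""
    return {"properties": {name: {"type": ftype} for name, ftype in SORT_FIELDS.items()}}


def build_search_body(query_text, sort_field, order="asc"):
    """Build a search request body sorted on one of the canonical fields."""
    if sort_field not in SORT_FIELDS:
        raise ValueError("unsupported sort field: %s" % sort_field)
    return {
        "query": {"query_string": {"query": query_text}},
        # Documents that have the mapping but no value are placed at the end.
        "sort": [{sort_field: {"order": order, "missing": "_last"}}],
    }
```

With the same fragment present in every index, a repository that has no value for a field would still be sortable; its documents would simply be placed last.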
One workaround is to re-sort the results after running the search. This approach has been explained here. However, this solution will hurt search performance and increase application response time.
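For comparison, a rough sketch of the re-sort workaround is below. It assumes each hit is a dictionary with a `_source` object, as in an ElasticSearch search response, and it only looks at top-level fields. The candidate name `releaseDate` is hypothetical and stands in for whatever alternative names the repositories use.

```python
def resort_hits(hits, candidate_fields, reverse=False):
    """Re-sort search hits in application code, trying several field names per hit.

    Hits with no usable value are kept at the end, in their original order.
    """
    def value(hit):
        source = hit.get("_source", {})
        for name in candidate_fields:
            if source.get(name) not in (None, ""):
                return source[name]
        return None

    present = [hit for hit in hits if value(hit) is not None]
    missing = [hit for hit in hits if value(hit) is None]
    present.sort(key=value, reverse=reverse)
    return present + missing
```

Note that this only orders the hits that were actually fetched. To get a globally correct order, the application would have to retrieve every matching document before paginating, which is where the performance cost comes from.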
@jgrethe what's your opinion about this issue?
Need to make sure that all of these are actually in the mapping.
@aegururaj I attached a list of missing or invalid fields in the current release.
Repository | Title | Date Published | Author |
---|---|---|---|
ArrayExpress | title (string) | dateReleased (date) | NOT FOUND |
BioProject | title (string) | dateReleased (date) | NOT FOUND |
CIA | title (string) | [datasetdistribution].dateReleased (string) | NOT FOUND |
CIL | title (string) | NOT FOUND | NOT FOUND |
ClinicalTrials | title (string) | [datasetdistribution].dateReleased (string) | creator (string) |
CTN | title (string) | [datasetdistribution].dateReleased (date) | creator (string) |
CVRG | title (string) | [datasetdistribution].dateReleased (string) | NOT FOUND |
DataVerse | title (string) | dateReleased (string) | NOT FOUND |
Dryad | title (string) | [datasetdistribution].dateReleased (string) | creator (string) |
Gemma | title (string) | NOT FOUND | NOT FOUND |
Geo | title (string) | [datasetdistribution].dateReleased (string) | NOT FOUND |
Lincs | title (string) | dateReleased (string) | NOT FOUND |
MPD | title (string) | NOT FOUND | NOT FOUND |
Neuromorpho | title (string) | NOT FOUND | NOT FOUND |
Niddkcr | title (string) | NOT FOUND | NOT FOUND |
NursaDatasets | title (string) | NOT FOUND | NOT FOUND |
OpenFMRI | title (string) | NOT FOUND | NOT FOUND |
PDB | title (string) | dateReleased (date) | citation.author (string) |
Peptideatlas | title (string) | NOT FOUND | NOT FOUND |
dbGaP | title (string) | NOT FOUND | NOT FOUND |
Physiobank | title (string) | NOT FOUND | NOT FOUND |
ProteomExchange | title (string) | dateReleased (string) | NOT FOUND |
yped | title (string) | NOT FOUND | NOT FOUND |
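A report like the table above could be regenerated automatically before each data run. The sketch below assumes the mapping of each repository is already available as a dictionary with a `properties` key (for example, extracted from a get-mapping response). The `REQUIRED` names and the `audit` helper are hypothetical.

```python
# Hypothetical list of field names to look for in every repository mapping.
REQUIRED = ("title", "dateReleased", "creator")


def find_field(properties, name, prefix=""):
    """Return (path, type) for every occurrence of `name` in a nested mapping."""
    found = []
    for key, spec in properties.items():
        path = prefix + key
        if key == name and "type" in spec:
            found.append((path, spec["type"]))
        # Object fields keep their sub-fields under their own "properties" key.
        found.extend(find_field(spec.get("properties", {}), name, path + "."))
    return found


def audit(mappings):
    """Map each repository to the location and type of each required field."""
    report = {}
    for repo, mapping in mappings.items():
        props = mapping.get("properties", {})
        report[repo] = {name: find_field(props, name) or "NOT FOUND" for name in REQUIRED}
    return report
```

Any field reported as `NOT FOUND`, found under more than one path, or found with differing types across repositories would need attention before sorting on it.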
@yul129: can you verify that these have all been corrected for the upcoming data run?
Here is the updated status of mapping the fields https://docs.google.com/spreadsheets/d/1I8Cr0IH5rmzVO9NzRGv7vLQbRTgQwXsb5iaO107-hms/edit#gid=0
Hi @yul129 ,
This document is not public. We need your permission to open it.
I just updated the sharing setting, it should be public now.
From: RuilingLiu [notifications@github.com] Sent: Friday, August 19, 2016 2:34 PM To: biocaddie/prototype_issues Cc: Yueling Li; Mention Subject: Re: [biocaddie/prototype_issues] Add New Fields To Sorting (#72)
This feedback was collected from a survey feedback questionnaire. To protect users' privacy, personal information has been removed.
Sorting should be expanded to author, by date published, repository, title and we should be able to select the number of results I want to see per page. Additionally, export functions to citation managers is a good feature to have.