ome / omero-gallery

https://pypi.org/project/omero-gallery/
GNU Affero General Public License v3.0

Feedback idr fixes #35

Closed will-moore closed 5 years ago

will-moore commented 5 years ago

Various fixes from feedback on the IDR gallery.

To test:

will-moore commented 5 years ago

Trying to improve the regex to get only the surname from all Publication Authors... There's a mix of name formats (produced by splitting the Publication Authors value on ',', 'and' or '&'). E.g.:

I'm not even sure what the surname is for the last 2. If anyone can figure out a rule that gives the correct surname for each of these cases, please let me know. cc @sbesson @francesw. Currently I'm splitting on ' ' and keeping only the words that contain a lowercase letter. Then if there is 1 word, that's the surname; if there are 2 or more words, I ignore the first word (first name) and join the rest, e.g. Francesco Paolo Casale -> Paolo Casale. But also Petri Seiler K -> Seiler and M. Julius Hossain -> Hossain, which I think are probably wrong?
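For anyone who wants to try alternatives, the rule described above can be sketched roughly like this in Python. This is only an illustration of the heuristic as worded in this comment, not the code in the PR; the function name guess_surname is made up for the example.

```python
def guess_surname(name):
    """Guess a surname with the heuristic described in the comment above.

    Hypothetical helper for illustration; not taken from omero-gallery.
    """
    # Split on spaces and keep only words that contain a lowercase letter.
    # This drops bare initials such as "K" or "M.".
    words = [w for w in name.split(" ") if any(c.islower() for c in w)]
    if len(words) <= 1:
        # A single remaining word is taken as the surname
        # (an empty string is returned if nothing is left).
        return "".join(words)
    # Two or more words: ignore the first (assumed to be the first name)
    # and join the rest.
    return " ".join(words[1:])


if __name__ == "__main__":
    for author in ["Francesco Paolo Casale", "Petri Seiler K", "M. Julius Hossain"]:
        print(author, "->", guess_surname(author))
```

Running it on the three names quoted above reproduces the results listed in the comment, including the ones suspected to be wrong, which makes it easy to check any proposed replacement rule against the same cases.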

manics commented 5 years ago

If you've figured out a rule we could use it to automatically extract the surnames and enter them as individual key-value pairs. Or maybe have each author as a separate map-ann in an /author namespace

sbesson commented 5 years ago

Re publication authors, my inclination would be to work towards unifying the formatting of the Publication Authors key in the study files and their representation in IDR. @manics has proposed a few ways to store the data as map annotations. For the study files, I can certainly see us using the PubMed style everywhere, especially since we use the PubMed ID as the primary publication identifier, i.e. Last Name 1 <Initials 1>, Last Name 2 <Initials 2>. Probably something worth discussing on @francesw's return.
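If the study files did converge on the Last Name <Initials> form, surname extraction would become much simpler. A minimal sketch, assuming authors are separated by ',' and each entry ends with an all-uppercase initials token; the function name and the sample string are illustrative only, not real study-file values.

```python
def surnames_from_pubmed_style(authors):
    """Extract surnames from a 'Last Name Initials, Last Name Initials' string.

    Hypothetical helper: assumes comma-separated entries whose final
    token is a block of uppercase initials.
    """
    surnames = []
    for entry in authors.split(","):
        parts = entry.strip().split(" ")
        # Drop the trailing initials token if there is one.
        if len(parts) > 1 and parts[-1].isupper():
            parts = parts[:-1]
        surname = " ".join(parts)
        if surname:
            surnames.append(surname)
    return surnames


if __name__ == "__main__":
    # Made-up example string in the proposed format.
    print(surnames_from_pubmed_style("Casale FP, Hossain MJ"))
```

Entries without a trailing initials token are kept whole, so a value that is not yet in the unified format would pass through unchanged rather than be truncated.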

joshmoore commented 5 years ago

tl;dr - happy for this to be merged if there aren't any easy fixes that @will-moore wants to get in.

will-moore commented 5 years ago

Nothing else to add right now, thanks.