antleaf / jct-user-stories

Repository for collecting, discussing and categorising user stories supporting the development of the Plan S Journal Tracking Tool

As a RESEARCHER, I want to be able to get search results for journal names with special characters whether I’m using a special keyboard, inserting special characters, or not using special characters. #43

Open louisecpage opened 4 years ago

markmacgillivray commented 4 years ago

@paulwalk @richard-jones I've discussed this on other projects before, there are some things to consider as it is not a straightforward decision. Supporting search results for special characters is not hard in itself, and neither is supporting search without special characters, but supporting either OR both requires a decision about what experience is intended for the end user, and what the service wants to deliver. See below:

hjh33 commented 4 years ago

@markmacgillivray when you've discussed this on other projects, did those discussions include people whose native languages use special characters, and what was the outcome?

Would fuzzy matching operate within the autocomplete suggestions and incorporate mistypes and special character usage (or not) within the results returned?

markmacgillivray commented 4 years ago

@hjh33 I've provided similar technical points but haven't been involved in the decisions of those projects - one of them was DOAJ though, I think, so @richard-jones may know what they chose for their user base, which does have some users with native languages that contain special characters.

About fuzzy matching, there's an issue that already relates to it, #36, which is in future requirements. I'd need to check whether fuzzy matches would incorporate special characters, but I believe they would.
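To make the fuzzy matching question concrete, here is a minimal sketch of how fuzzy matching could treat mistypes and special characters together. It is an illustration only, not how the tool or #36 is implemented: the `fold` and `fuzzy_lookup` names, the 0.8 cutoff, and the sample titles are all assumptions, and it uses only Python's standard `unicodedata` and `difflib` modules.

```python
import difflib
import unicodedata


def fold(text):
    """Lowercase the text and strip combining marks (e.g. 'É' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def fuzzy_lookup(query, titles, cutoff=0.8, limit=3):
    """Return stored titles whose folded form is close to the folded query.

    Hypothetical helper: the cutoff and limit are arbitrary illustrative values.
    Titles that fold to the same key would overwrite each other in this sketch.
    """
    folded_to_title = {fold(title): title for title in titles}
    hits = difflib.get_close_matches(
        fold(query), list(folded_to_title), n=limit, cutoff=cutoff
    )
    return [folded_to_title[hit] for hit in hits]


# Illustrative data only, not taken from the tool's index.
SAMPLE_TITLES = ["Études Françaises", "Acta Physica Polonica", "Nature"]

# A query with no accents and a transposed ending still finds the stored title.
print(fuzzy_lookup("Etudes Francaisse", SAMPLE_TITLES))
```

Because both the query and the stored titles are folded before comparison, a missing accent costs nothing against the similarity cutoff, and only genuine mistypes do.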

Note, there will always be some limit to what "special characters" we can match. If we are talking about things like umlauts in human languages, that sort of thing should be fine. But if we are talking about things like special mathematical characters that certain users may use in the titles of things, we probably can't match those, because they usually get mangled during the ingest process before reaching us, e.g. Crossref and similar sources don't receive them accurately from the publishers that submit them. I've seen this at article title level rather than at journal title level, but I'd guess there could be some obscure mathematical journal somewhere in the world that uses special characters we will never be able to match.

I think the best solution for now is that when we know a journal title contains special characters (because we receive it that way from Crossref, for example), we will store it that way. If a user then searches with the special characters, we will match it. Also, if the UI supports autocompletion, then while the user is typing they will see possible titles appear anyway, which may help them select the correct one (and also helps minimise typing and spelling mistakes). Later, in future requirements, we can also add fuzzy matching if feedback from the first release indicates that users are too often having trouble finding the correct journal. Does that sound suitable to you?
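As a rough sketch of the behaviour described in this comment, the following stores titles exactly as received and offers autocomplete suggestions by prefix. Matches that agree on the special characters come first. As an optional extension that goes beyond the proposal here, accent-folded matches are appended, so a user typing without special characters still sees candidates. The function names and sample titles are hypothetical, and this is not the tool's actual implementation.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and strip combining marks (e.g. 'ü' becomes 'u')."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def suggest(prefix: str, titles: list[str], limit: int = 5) -> list[str]:
    """Return stored titles that start with the prefix.

    Titles matching with special characters intact are listed first;
    titles matching only after accent folding follow.
    """
    exact_prefix = unicodedata.normalize("NFC", prefix).casefold()
    folded_prefix = fold(prefix)
    exact, folded = [], []
    for title in titles:
        if unicodedata.normalize("NFC", title).casefold().startswith(exact_prefix):
            exact.append(title)
        elif fold(title).startswith(folded_prefix):
            folded.append(title)
    return (exact + folded)[:limit]


# Illustrative data only; the second title is invented for the example.
SAMPLE_TITLES = [
    "Zeitschrift für Physik",
    "Zeitschrift fur Testzwecke",
    "Revista Española de Cardiología",
]

print(suggest("Zeitschrift für", SAMPLE_TITLES))
print(suggest("revista espanola", SAMPLE_TITLES))
```

Ranking exact matches ahead of folded ones keeps the "search with special characters and we will match it" behaviour intact, while the folded fallback is a separate decision that could be switched off if it does not suit the intended user experience.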

hjh33 commented 4 years ago

@markmacgillivray that seems like a logical suggestion. @richard-jones, it would be good to know what DOAJ went with.