yjunechoe / Snowglobe

Snowglobe
http://snowglobe.soc.northwestern.edu/
2 stars 0 forks source link

Multiple input handling #11

Closed mcweenysean closed 3 years ago

mcweenysean commented 3 years ago

Ran into this issue and it definitely needs solving. DOIs are case sensitive (separate issue) but here it meant that all the inputs with "bad" DOIs weren't found, even though they had titles that worked once I removed the PMID and DOI columns.

I've been swamped - any help on this would be appreciated, but I will get to it if not.

HCNSnowGlobeTitlesEdit.txt

mcweenysean commented 3 years ago

DOIs are all upper-case apparently so forcing them to upper was a super easy fix. Was worried that was gonna be messier than it was. Multiple input handling still needs to be fixed though.

yjunechoe commented 3 years ago

What was the exact issue with multiple input handling again? So a paper with searchable title doesn't work when its DOI doesn't work? What is returned? (error? a row with all NAs? Something else?)

My guess is that something's up in the part of the code that handles search order:

https://github.com/yjunechoe/Snowglobe/blob/6d4099d650ba132b70fcd9999649dc26818d1857/server.R#L71-L86

This is supposed to make it so that you search for DOI first, and if that fails search for Title, then PMID. It might be getting caught on the DOI search failing and not moving onto title search as it's supposed to. I kinda hacked my way to this code so I wouldn't be surprised this is causing the problem.

mcweenysean commented 3 years ago

haha yes that's exactly the issue. If the DOI is not correct, it doesn't proceed to title or PMID search. It doesn't return NA but rather an error, so we'll have to tryCatch() likely.

yjunechoe commented 3 years ago

So the possibly_na() function that I use is a wrapper around purrr::possibly(), which is basically tryCatch() but with a more sanitized output.

https://github.com/yjunechoe/Snowglobe/blob/6d4099d650ba132b70fcd9999649dc26818d1857/server.R#L55-L57

Actually, although the code isn't super neat I'd be surprised if this is part is the issue because I remember checking for that edge case where DOI is weird but other info is fine. Can you double check this by like commenting out the lines that attempt a DOI search (L72-L74)? Does that avoid triggering an error?

mcweenysean commented 3 years ago

Huh - I can't seem to recreate it now. I'm not totally sure what happened then. I uploaded a list of 82 (with title, doi and pmid) and it only returned 16 at the time. Now its returning 79 even if I fuck with the DOI. I'm going to close this for now, but I wouldn't be surprised if something like this comes back up but we should be ok.

yjunechoe commented 3 years ago

RE: the upper case DOI thing, it looks like I actually ran into this and fixed it in 84fa2ba8bcd6cca55143be8946558cb3cdf865ef except I uppercased the input before it was passed into the function, so it wasn't a real fix.

I'm going to remove the toupper since that's redundant now with your fix