camaradesuk / ASySD

https://camaradesuk.github.io/ASySD/
GNU General Public License v3.0

How did you come up with such a large number of filters? #15

Closed: rohitgarud closed this issue 1 year ago

rohitgarud commented 1 year ago

Sorry, this is not an issue, but as there is no discussion section, I am asking it here. Did you follow some iterative method to add filters sequentially, or did you use some kind of heuristic?

kaitlynhair commented 1 year ago

Hi - sorry I didn't reply sooner. I'm just tidying up a few of the outstanding issues now.
I developed ASySD in response to a specific problem I had: I couldn't deduplicate huge numbers of records effectively in EndNote and couldn't find another solution at the time. I used several datasets we had within our research group from past systematic reviews. I basically added filters I thought made sense, then did A LOT(!) of trial and error, manually checking false positives / false negatives on those testing datasets and tweaking the filters continually. When I was happy with it, I tested the final version on 5 unseen (but similar) datasets and it worked pretty well.

I really didn't follow a specific process back then as it was very experimental at the start. In future I plan to re-evaluate the match filters using a more systematic approach and adapt them to suit different article types / research databases.
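For readers wanting to try the same trial-and-error loop, here is a minimal Python sketch of the idea described above: run a set of match filters over a labelled test dataset, list the false positives and false negatives, tweak, and repeat. It is an illustration only, not ASySD's actual filters or code. The records, field names, thresholds, and the helpers `make_title_year_filter`, `same_doi`, and `evaluate` are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical labelled test set: records 1/2 and 4/5 are known duplicates.
records = [
    {"id": 1, "title": "Effects of drug X on stroke outcome in mice", "year": 2015, "doi": "10.1000/a1"},
    {"id": 2, "title": "Effects of Drug X on Stroke Outcome in Mice.", "year": 2015, "doi": ""},
    {"id": 3, "title": "Effects of drug Y on stroke outcome in mice", "year": 2015, "doi": "10.1000/b2"},
    {"id": 4, "title": "A systematic review of sleep and memory", "year": 2018, "doi": "10.1000/c3"},
    {"id": 5, "title": "Sleep and memory: a systematic review", "year": 2018, "doi": "10.1000/c3"},
]
true_pairs = {frozenset((1, 2)), frozenset((4, 5))}


def make_title_year_filter(threshold):
    """Build a filter: same year and title similarity at or above threshold."""
    def matches(a, b):
        similarity = SequenceMatcher(
            None, a["title"].lower(), b["title"].lower()
        ).ratio()
        return a["year"] == b["year"] and similarity >= threshold
    return matches


def same_doi(a, b):
    """Filter: both records carry the same non-empty DOI."""
    return bool(a["doi"]) and a["doi"] == b["doi"]


def evaluate(records, filters, true_pairs):
    """Flag a pair as duplicate if ANY filter matches, then compare with the labels."""
    predicted = {
        frozenset((a["id"], b["id"]))
        for a, b in combinations(records, 2)
        if any(f(a, b) for f in filters)
    }
    false_positives = predicted - true_pairs
    false_negatives = true_pairs - predicted
    return false_positives, false_negatives


# Iteration 1: a loose title filter. Iteration 2: stricter title filter plus a DOI filter.
candidate_filter_sets = {
    "title>=0.90": [make_title_year_filter(0.90)],
    "title>=0.98 + doi": [make_title_year_filter(0.98), same_doi],
}
for name, filters in candidate_filter_sets.items():
    fp, fn = evaluate(records, filters, true_pairs)
    print(name, "| false positives:", sorted(map(sorted, fp)),
          "| false negatives:", sorted(map(sorted, fn)))
```

The point of the sketch is the loop rather than the particular filters: each round of tweaking is judged against the same labelled pairs, and the final filter set should then be checked on datasets that were not used for tuning.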

rohitgarud commented 1 year ago

@kaitlynhair Thank you for your response. I am working on a similar tool and wanted to get some ideas for approaching the problem of filters.