greenelab / word-lapse

Explore how a word changes over time
https://greenelab.github.io/word-lapse/

normalization #60

Closed MathCancer closed 2 years ago

MathCancer commented 2 years ago

As per the Slack discussion, it might be good to allow normalization based on a regularly occurring term such as "actin" or even "control".

danich1 commented 2 years ago

> As per the Slack discussion, it might be good to allow normalization based on a regularly occurring term such as "actin" or even "control".

Normalization as in 'controls' -> 'control' or 'actins' -> 'actin'? If yes, then I can look into it. I was already using spaCy's lemmatizer, but unsurprisingly it isn't great at normalizing scientific words.

MathCancer commented 2 years ago

I meant using some commonly occurring word as a control for the search term, to normalize the results for growth in the number of papers over time.

search( "term" ) ./ search( "some control term" ).

@cgreene and I poked around at this a bit, and words used in most bio papers could be good controls / normalizations. The term "control" actually looks promising. So does "actin" (since so many western blots use beta-actin in the control lanes).
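
For illustration, the element-wise division above could be sketched in Python roughly like this. It is only a sketch under assumptions: the function name `normalize_counts`, the year-to-count dictionaries, and the numbers are all hypothetical, and none of it is taken from the word-lapse codebase.

```python
from typing import Dict


def normalize_counts(
    term_counts: Dict[int, int],
    control_counts: Dict[int, int],
) -> Dict[int, float]:
    """Divide a term's yearly count by the control term's count for the same year."""
    normalized = {}
    for year, count in sorted(term_counts.items()):
        control = control_counts.get(year, 0)
        # Skip years where the control term never appears, to avoid dividing by zero.
        if control > 0:
            normalized[year] = count / control
    return normalized


# Made-up counts, for illustration only (not real word-lapse data).
term = {2000: 10, 2010: 40, 2020: 90}
control = {2000: 1000, 2010: 4000, 2020: 6000}

ratios = normalize_counts(term, control)
for year, ratio in ratios.items():
    print(year, round(ratio, 4))
```

With these made-up numbers the raw count grows 9x between 2000 and 2020, but the normalized ratio only grows 1.5x, since the control term's count grew 6x over the same period.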

vincerubinetti commented 2 years ago

I believe this was incorporated on all fronts (model, backend, frontend with #63)?