Closed by valearna 4 years ago
We also need a manual blacklist for other entities, such as the CB2 strain.
We decided to try implementing a TF-IDF-based threshold and to compare the results with the manual blacklists: https://docs.google.com/spreadsheets/d/1hpo3DCIcOX20mrLNOQk4KdU3Bal7bJJGIdiJSsRqbsw
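For anyone following along, here is a rough sketch of what a TF-IDF threshold on extracted entities could look like. It is only an illustration, not the pipeline's actual code: the function names `tfidf` and `filter_entities`, the smoothed IDF formula, and the threshold value are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf(term, doc_tokens, corpus):
    """TF-IDF of `term` in one tokenized document, relative to `corpus`.

    `corpus` is a list of tokenized documents (lists of strings).
    """
    if not doc_tokens:
        return 0.0
    # Term frequency: share of the document's tokens that are `term`.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    docs_with_term = sum(1 for d in corpus if term in d)
    # Smoothed IDF (an assumption) so unseen terms do not divide by zero.
    idf = math.log((1 + len(corpus)) / (1 + docs_with_term)) + 1
    return tf * idf


def filter_entities(candidates, doc_tokens, corpus, threshold):
    """Keep only candidate entities whose TF-IDF reaches `threshold`."""
    return [e for e in candidates if tfidf(e, doc_tokens, corpus) >= threshold]


# Toy data: "CB2" appears in every paper, "lin-12" only in the current one.
paper = ["lin-12", "lin-12", "CB2", "worm"]
corpus = [paper, ["CB2", "strain"], ["CB2", "mutant"], ["CB2", "assay"]]
kept = filter_entities(["lin-12", "CB2"], paper, corpus, threshold=0.5)
```

With this toy data the ubiquitous term falls below the threshold and is dropped, which is the effect a manual blacklist entry would otherwise have to provide.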
Closing, as we are using TF-IDF. If we need to add manual blacklists, we will comment them in.
From email conversation with @vanaukenk:
We could use the ratio of curated references in WB to TextpressoCentral hits (from a keyword search on the gene name) as an indication of the probability that a match is a false positive. The higher the ratio, the lower the probability.
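A minimal sketch of the ratio idea, assuming the two counts per gene name have already been fetched from WB and TextpressoCentral. The function names, the example gene names, and the counts are hypothetical, and treating a name with zero hits as ratio 0 is an assumption as well.

```python
def curated_to_hits_ratio(curated_refs, textpresso_hits):
    """Ratio of curated WB references to TextpressoCentral keyword hits.

    A higher ratio suggests a lower chance that a match is a false positive.
    """
    if textpresso_hits <= 0:
        # Assumption: no hits means no evidence, so use the lowest ratio.
        return 0.0
    return curated_refs / textpresso_hits


def rank_by_false_positive_risk(counts):
    """Return names ordered from most to least likely false positive.

    `counts` maps a name to a (curated_refs, textpresso_hits) tuple.
    """
    return sorted(counts, key=lambda name: curated_to_hits_ratio(*counts[name]))


# Hypothetical counts, for illustration only.
counts = {"gene-a": (90, 100), "gene-b": (2, 400)}
ranking = rank_by_false_positive_risk(counts)
```

Here `gene-b` is ranked first because most of its keyword hits are not backed by curated references, so its matches would be the first to review or filter.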