connorjoleary / DeepCite

Traversing links to find the deep source of information
GNU General Public License v3.0
69 stars 7 forks source link

Improve model score #119

Open connorjoleary opened 2 years ago

connorjoleary commented 2 years ago

Is your feature request related to a problem? Please describe. Currently the model takes the similarity of each block of text to the parent. Then after every block is computed, it multiplies this score by it's similarity to the original claim. This algorithm leads to two problems:

Describe the solution you'd like A model which calculates the score for each layer and takes into account the similarity to the original claim (not just at the end) and weights deeper nodes as better.

Describe alternatives you've considered Potentially this is where we could introduce a simple ml model which creates these scores based on the data we have labeled, but this seems like overkill at this point.

Additional context Using a Jupyter notebook to rerun data would be a good way to test. Then you can see if the labeled data is getting higher or lower scores.