Improve model score - Githubissues

Is your feature request related to a problem? Please describe. Currently, the model computes the similarity of each block of text to its parent. After every block has been scored, it multiplies each score by the block's similarity to the original claim. This algorithm leads to two problems:

Claims that are more dissimilar but further down in the tree don't get scored highly. We should take into account how far down the tree the source is and, if it is very far down, increase its score.
The algorithm can have runaway branches: because it only looks at the similarity to the parent, it doesn't realize how far it has drifted from the original claim.
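To make the two problems concrete, here is a minimal sketch of the scoring as described in this issue. The function name `current_score` and all similarity values are made up for illustration and are not taken from the DeepCite code.

```python
def current_score(sim_to_parent: float, sim_to_claim: float) -> float:
    """Score as described in this issue: parent similarity times claim similarity."""
    return sim_to_parent * sim_to_claim


# Problem 1: depth never enters the score. Made-up similarity values in [0, 1].
shallow = current_score(sim_to_parent=0.9, sim_to_claim=0.8)  # imagined node at depth 1
deep = current_score(sim_to_parent=0.9, sim_to_claim=0.6)     # imagined node at depth 4
print(shallow > deep)  # the deeper source loses, however far down the tree it sits

# Problem 2: every step of a branch can look fine against its parent
# while the branch drifts steadily away from the original claim.
branch = [(0.9, 0.8), (0.9, 0.5), (0.9, 0.2)]  # (sim_to_parent, sim_to_claim) per layer
parent_sims = [p for p, _ in branch]  # constant, so nothing signals the drift
claim_sims = [c for _, c in branch]   # falling at every layer
```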

Describe the solution you'd like A model that calculates the score at each layer, takes the similarity to the original claim into account at every layer (not just at the end), and weights deeper nodes more heavily.
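One possible shape for such a score, as a rough sketch only. The function `layered_score`, the linear depth bonus, and the `min_claim_sim` cutoff are all assumptions for discussion, not an agreed design, and the default values are arbitrary.

```python
def layered_score(
    sim_to_parent: float,
    sim_to_claim: float,
    depth: int,
    depth_bonus: float = 0.2,    # assumed: extra weight per layer of depth
    min_claim_sim: float = 0.3,  # assumed: below this, treat the branch as runaway
) -> float:
    """Score one node at its own layer, using claim similarity and depth."""
    if sim_to_claim < min_claim_sim:
        # Checked at every layer, so a drifting branch is caught early
        # instead of only being penalised after the whole tree is built.
        return 0.0
    return sim_to_parent * sim_to_claim * (1.0 + depth_bonus * depth)


# Made-up values: the deeper node now outranks the shallower one.
shallow = layered_score(sim_to_parent=0.9, sim_to_claim=0.8, depth=1)
deep = layered_score(sim_to_parent=0.9, sim_to_claim=0.6, depth=4)
runaway = layered_score(sim_to_parent=0.95, sim_to_claim=0.1, depth=5)
```

With a multiplicative bonus like this the score is no longer bounded by 1, so it would need normalising if downstream code expects a value in [0, 1].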

Describe alternatives you've considered Potentially this is where we could introduce a simple ML model that produces these scores based on the data we have labeled, but this seems like overkill at this point.

Additional context Rerunning the labeled data in a Jupyter notebook would be a good way to test. You can then see whether the labeled data gets higher or lower scores under the new scoring.
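A notebook cell for that comparison might look roughly like the sketch below. The tuple layout, the helper `mean_score_gap`, both scoring functions, and every number in `labeled` are placeholders invented for illustration; the real labeled data would be loaded in their place.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Assumed layout: (sim_to_parent, sim_to_claim, depth, is_correct_source)
Example = Tuple[float, float, int, bool]


def old_score(sim_to_parent: float, sim_to_claim: float, depth: int) -> float:
    return sim_to_parent * sim_to_claim  # depth is ignored


def new_score(sim_to_parent: float, sim_to_claim: float, depth: int) -> float:
    return sim_to_parent * sim_to_claim * (1.0 + 0.2 * depth)  # assumed depth bonus


def mean_score_gap(examples: List[Example], score_fn: Callable[[float, float, int], float]) -> float:
    """Mean score of sources labeled correct minus mean score of the rest."""
    correct = [score_fn(p, c, d) for p, c, d, ok in examples if ok]
    other = [score_fn(p, c, d) for p, c, d, ok in examples if not ok]
    return mean(correct) - mean(other)  # raises StatisticsError if either list is empty


# Placeholder rows, NOT real labeled data.
labeled = [
    (0.9, 0.6, 4, True),
    (0.9, 0.8, 1, False),
    (0.8, 0.7, 3, True),
    (0.9, 0.5, 1, False),
]

old_gap = mean_score_gap(labeled, old_score)
new_gap = mean_score_gap(labeled, new_score)
print(f"old gap: {old_gap:.3f}, new gap: {new_gap:.3f}")
```

A larger gap means the scoring separates labeled-correct sources from the rest more clearly, which is one simple way to read "higher or lower scores" for the labeled data.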

connorjoleary / DeepCite

Improve model score #119