-
Hello, my name is Taegun Harshbarger.
My idea was to take two books by Charles Dickens and compare their word frequencies. I would have to filter out a lot of articles and the like. I would do _A T…
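A minimal sketch of the comparison described above, assuming plain-text input and a deliberately tiny stop-word list. The names `STOP_WORDS`, `word_frequencies`, and `compare`, and both sample strings, are made up for illustration; a real run would load the full book texts and use a fuller stop-word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (articles and similar filler).
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "it", "was", "is"}

def word_frequencies(text):
    """Count lowercase words in text, skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def compare(text_a, text_b, top=5):
    """Return (word, count_in_a, count_in_b) for the most frequent shared words."""
    freq_a, freq_b = word_frequencies(text_a), word_frequencies(text_b)
    shared = set(freq_a) & set(freq_b)
    # Sort by combined count (descending), then alphabetically for stable output.
    ranked = sorted(shared, key=lambda w: (-(freq_a[w] + freq_b[w]), w))
    return [(w, freq_a[w], freq_b[w]) for w in ranked[:top]]

# Placeholder strings standing in for the two books.
sample_a = "the river, the fog, and the river again"
sample_b = "fog on the river; fog in the street"
print(compare(sample_a, sample_b))
```

Restricting the output to shared words keeps the comparison readable; words unique to one book could be reported separately with a set difference.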
-
## Description
I want to be able to search for words that vary not because of typos but because they are similar-sounding names.
## Steps to reproduce
Something like this exists for Algolia:
https://sp…
-
Hi.
I'm calculating cosine similarity based on the pre-trained word vectors, and getting these results:
- "good" & "bad": `0.6704173041669614`
- "great" & "awesome": `0.3765588570243393`
These r…
piwvh updated
5 years ago
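For reference, cosine similarity itself can be sketched in a few lines. The vectors below are toy values, not real pre-trained embeddings, and `cosine_similarity` is an illustrative helper rather than any library's API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy vectors: same direction, orthogonal, and opposite direction.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print(cosine_similarity([1.0, 2.0], [-1.0, -2.0]))
```

The score depends only on the direction of the vectors, not their length. Word vectors trained on co-occurrence tend to place words that appear in similar contexts close together, which can include antonyms.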
-
In one of my queries, the formatter changes this regex `$$^[^_].+$$` to this: `$$ ^ [^_].+ $$`.
Adding the spaces breaks the filter that uses the regex, which corrupts the results returned by the query. In my p…
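The effect of the inserted spaces can be reproduced with Python's `re` module as a stand-in for the query engine's regex implementation. This is a sketch under two assumptions: that `$$` is the query language's string delimiter rather than part of the pattern, and that the engine treats literal spaces the way Python does.

```python
import re

# Pattern text between the assumed $$ delimiters, before and after formatting.
original = r"^[^_].+"
reformatted = r" ^ [^_].+ "

for value in ["name", "_hidden", " name "]:
    before = re.search(original, value) is not None
    after = re.search(reformatted, value) is not None
    print(f"{value!r}: original={before}, reformatted={after}")
```

In the reformatted pattern a literal space must be consumed before the `^` anchor, so under these assumptions the pattern can no longer match at the start of a string at all.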
-
https://github.com/pauldeschacht/pdfgrid/blob/master/doc.txt
WordPosition
------------
For each page, a list of WordPositions is extracted. Each WordPosition contains
* the pdf coordinates (TODO:…
-
**Is your feature request related to a problem? Please describe.**
I am trying to filter metadata with queries produced by a LLM. The issue is that there can be slight variations in capitalization, s…
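As an illustration of tolerant matching, the sketch below compares metadata values after case folding. It is a generic example, not the API of any particular store; `normalize` and `matches` are hypothetical helpers, and trimming surrounding whitespace is an extra assumption beyond the capitalization issue mentioned above.

```python
def normalize(value):
    """Case-fold a value and trim surrounding whitespace (the trim is an assumption)."""
    return str(value).strip().casefold()

def matches(metadata, filters):
    """True if every filter key equals the metadata value after normalization."""
    return all(
        key in metadata and normalize(metadata[key]) == normalize(wanted)
        for key, wanted in filters.items()
    )

# Made-up records and a made-up filter such as an LLM might produce.
records = [{"author": "Jane Doe"}, {"author": "john smith"}]
llm_filter = {"author": "JANE DOE "}
print([r for r in records if matches(r, llm_filter)])
```

`str.casefold` is used instead of `str.lower` because it also handles case pairs outside ASCII, such as the German "ß".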
-
**Is your feature request related to a problem? Please describe.**
When notes are of a non-atomic (mixed) type, it is better to have similarity suggestions for the current sentence or for each sentence.
**Des…
vasnt updated
6 months ago
-
[Repository](https://github.com/knoxdw/nanogenmo2018), [text](https://raw.githubusercontent.com/knoxdw/nanogenmo2018/master/makeshift.txt), and [pdf](https://raw.githubusercontent.com/knoxdw/nanogenmo…
-
## 0. Paper
Authors: Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
Year: 2018
ArXiv: [[link](https://arxiv.org/abs/1710.04087)]
## 1. What is it?
They build…
a1da4 updated
3 years ago
-
I am using the bert-base-nli-stsb-mean-tokens model in an unsupervised fashion to get the similarity between sentences.
It performs really well in some cases.
However, after extensive analysis, I found some …