-
Hi, I'm trying to get xrenner to work, but I run into problems with the tokenizer from the `transformers` package. Here is the code I'm trying to run:
```
import xrenner
data = """
1 The the D…
-
I run the notebook with the properly downloaded dataset, but I encounter the following error when fitting the model:
```
Not in vocab
Not in vocab
Not in vocab
Not in vocab
Not in vocab
Not in vocab
N…
```
-
This error is raised when I use textaugment with gensim 4, but not with gensim 3:
```
File "aug.py", line 15, in
data_df['paraphrased_text'] = data_df['text'].progress_apply(lambda x: w2v.augment(…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
## Tracking integration of task - text-nearest
Naming is tentative
Note that you're not expected to do all of the following steps. This PR helps track all the steps required to get a new task f…
-
I am unable to replicate the results for the MS MARCO passage subset experiment for monoT5 [in this section](https://github.com/castorini/pygaggle/blob/master/docs/experiments-msmarco-passage-subset.m…
-
```
model_path = "./models/en/ft_cc.en.300_freqprune_50K_5K_pq_100.bin"
big_model = gensim.models.fasttext.FastTextKeyedVectors.load(model_path)
small_model = compress_fasttext.prune_ft_freq(big_m…
```
-
Given a text document (PDF, Markdown, etc.) of any length, we need to extract a list of relevant topics, perhaps 5-20 or more.
This is to help us generate question/answer keypairs programmatically …
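One way to sketch the shape of this task (an illustration only, assuming plain text has already been extracted from the PDF/Markdown; a real pipeline would use TF-IDF, embeddings, or an LLM rather than raw frequency):

```python
# Hypothetical frequency-based topic extraction: rank non-stopword terms
# by count and return the top k as candidate topics.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for",
             "on", "that", "this", "with", "as", "are", "be", "we",
             "at", "while"}

def extract_topics(text, k=5):
    """Return the k most frequent non-stopword terms as candidate topics."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(k)]

doc = ("Neural networks learn representations. Convolutional neural "
       "networks excel at vision tasks, while recurrent networks and "
       "transformers dominate language tasks.")
print(extract_topics(doc, k=3))
```

Each extracted topic could then seed one question/answer pair downstream.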
-
I am using `bwv = BengaliWord2Vec()`, but I am unable to get the vocabulary length.
-
I am trying to reproduce the `ft_cc.en.300_freqprune_50K_5K_pq_100.bin` model from the original fastText model.
This is my code:
```
org_model_path = 'cc.en.300.bin'
print(fasttext.util.download_mod…