-
From Step 3, I trained on my own dataset and obtained two JSON files.
For Step 4 (Format to Simpler Json Files), I get this error:
> FileNotFoundError: [Errno 2] No such file or directory: '/content/dri…
-
I'm writing a Python script that mimics the behavior of lmplz.
When I tested it out on a large corpus, I found the estimated probabilities differed slightly from lmplz's output.
By shrinking the c…
-
Trace
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/Users/George/miniconda3/envs/d4d-…
-
Pretraining BERT from `base` **requires the vocabulary** `vocab.txt`. Does this `vocab.txt` need to be the exhaustive intersection vocabulary from the `base` and the **domain-specific corpus** we wou…
-
The current export options are for
* WARC
* WARC.gz
* WARC-with-resources
* WARC-with-resources.gz
* CSV
#245 suggests adding ZIP as an option and #233 calls for ways of restricting the reso…
-
I first used tevatron to train DPR from bert-base-uncased:
```
python -m torch.distributed.launch --nproc_per_node=1 -m tevatron.driver.train \
--output_dir model_wq \
--dataset_name Tev…
-
This test blows up with "Required plugin [=inc::MyMetadata] isn't installed...."
```
use strict;
use warnings FATAL => 'all';
use Test::More;
use Dist::Zilla::Tester;
use Test::DZil;
my $tzil = Bui…
-
Original message:
> Hi all,
>
> I'm working on deploying the bwmf tasks to factor a real-world corpus. Here are the status and todos:
>
> **Data**. Now we have a sina news corpus dataset which has…
-
Because the system disk was running out of space, I changed the path, but when running
python -m flashrag.retriever.index_builder \
--retrieval_method e5 \
--model_path ./models/e5-base-v2 \
--corpus_path ./FlashRAG_datasets/retrieval-corpus/wiki-1…
-
Hi. I've just tried to compile lmplz and hit a segmentation fault. I also ran into errors while installing KenLM with Boost version 1.65, which I was able to resolve. …