Closed. lwj2001 closed this 3 weeks ago.
Your error appears to be caused by insufficient memory. Try reducing the batch size and max_workers (you can set them to 1 for testing).
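To illustrate why lowering these two settings reduces peak memory, here is a minimal sketch of batched chunking with a bounded worker pool. It is not the repository's actual code: `batched`, `chunk_text`, and `chunk_corpus` are hypothetical names, the word-window chunker is a stand-in, and threads are used only to keep the example simple (the real script may use processes).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def chunk_text(text, chunk_words=100):
    """Hypothetical chunker: split a document into fixed-size word windows."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def chunk_corpus(docs, batch_size=1, max_workers=1):
    """Chunk docs one batch at a time with at most max_workers workers."""
    chunks = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batched(docs, batch_size):
            # Only the documents of the current batch are being processed,
            # so smaller batches and fewer workers mean less work in flight.
            for doc_chunks in pool.map(chunk_text, batch):
                chunks.extend(doc_chunks)
    return chunks
```

With `batch_size=1` and `max_workers=1`, documents are processed strictly one at a time, which is the slowest but least memory-hungry setting and a reasonable starting point for checking whether memory is the cause.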
The warning shown while building the wiki corpus is raised by wikiextractor and is normal (see a similar issue in wikiextractor: https://github.com/attardi/wikiextractor/issues/33). My guess is that wikiextractor uses templates to parse wiki pages and some pages fail to match them. The number of such pages is very small, so it shouldn't matter.
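I followed the steps in docs/process-wiki.md to build the index step by step, and eventually got an error in step 3. Error log: the error seems to occur after chunking starts. I would like to ask about this warning: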
WARNING: Template errors in article 'Jarosław Olech' (56462895): title(1) recursion(0, 0, 0)
Does this warning have any impact on building the index?