-
Hi,
I'm running Bitextor version 8.2 in a Conda environment (on Ubuntu in WSL) and I'm getting errors in connection with Hunalign and Bicleaner. If I set ftfy, cleanHTML, html5lib, boilerplateClea…
-
Hi,
I'm running Bitextor version 8.2 in a conda environment in Linux and I'm using Python 3.8. I'm using externalMT for document alignment
and bleualign for sentence alignment, and I'm getting t…
-
### Type of issues
* other
----
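This is another big cleanup, following #2364. Please go through the list and check the packages under your name: tick a package off if you intend to keep maintaining it; if you no longer want to look after it, no longer remember it, or it seems to be of little use, just leave it unticked. For any package still unticked after one month, I will start the orphaning process. Remember that you can edit the comment directly instead of clicking each checkbox with the mouse.
Of the numbers after each package name, the first is…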
-
I am running Ubuntu 20.04 with Ruby 2.7.2 / cld3-ruby 3.4.3 on WSL and get a segmentation fault when I (by accident) pass identical values for min and max to the constructor `CLD3::NNetLanguageI…
-
When @ColinFay ran the test suite, the following tests failed. These tests pass on my local machine and also in GitHub Actions. We need to improve the tests so that they are more robust. Specifically:…
-
I was investigating a chunk of text being wrongly labeled as `uk` (Ukrainian) when it was actually `el` (Greek). It turned out that the misclassification only happens with certain values of `MAX_NUM_BYTES_TO_C…
-
Hi, Python 3.10 seems to require a different syntax for wheel building (PEP 517).
I just caught the following error on Travis, see this log (https://travis-ci.org/github/adbar/trafilatura/jobs/7152…
adbar updated
2 years ago
-
Could you add the following documentation under the `platform specific` section in the `readme`?
> On Ubuntu/Debian to install the necessary `protobuf` dependencies run:
>
> ```
> apt install…
-
The documentation mentions the constant `MAX_NUM_INPUT_BYTES_TO_CONSIDER` as the maximum number of bytes which are processed for `find_top_n_most_freq_langs`, but it seems the variable isn't really us…
-
### Behaviour
#### Steps to reproduce this issue
1. Create/build an image because we want to test if the changes pass (following https://github.com/docker/build-push-action/blob/master/docs/adva…