-
We currently use the Moses tokenizer for alignments because it seems to be a standard in the MT world and it is what OpusTrainer supports for detokenization (we will likely feed tokenized text to it to…
-
My laptop constantly runs out of disk space because of the .git folder in this repository. One of the space consumers that can be easily fixed is ICU. Can we just switch to the system ICU? What is the point of building ICU from the …
-
Some teams and projects that may be interested in a clean, working ICU build now that Windows ships it:
- [ ] [Bottles](https://docs.usebottles.com/contribute/missing-dependencies)
- […
-
### What happens?
The `icu` tokenizer may not split words correctly when the text contains emojis, throwing an index-out-of-bounds error or deadlocking.
### To Reproduce
Using Docker image `paradedb/para…
-
### Description
The Ubuntu image is not built with the ICU packages required for a dotnet build:
Process terminated. Couldn't find a valid ICU package installed on the system. Set the configuration f…
-
The ICU files are checked for and downloaded before the GUI can appear, which makes the app seem to either not start or run very slowly. On a slow connection the user can be waiting for 30 seconds or …
-
I get an exception like this when I try to open an XML file with the XML editor from WST:
```
NoClassDefFoundError: com/ibm/icu/util/StringTokenizer
at org.eclipse.wst.sse.core.utils.StringUtils.…
-
Hi all, this is obviously a post-ATS consideration, but I would be interested in having the location_category mapped to distinguish different types of ICUs (Medical ICU (could combine with CCUs as wel…
-
Hi, I've just tried following the “readme” instructions, but nothing happens on intervals.icu. I'm a beginner at coding, but I have the impression that I've done everything right. Is it possible to have …