approach0 / search-engine

A math-aware search engine.
http://approach0.xyz
MIT License

Inside a docker container, a0 eats string following some utf-8 characters. #33

Closed by w32zhong 3 years ago

w32zhong commented 3 years ago

When run from the Debian debian:buster image, a0 drops the rest of a string following certain UTF-8 characters.

Example:

# docker run -it -p 8921:8921 -v `pwd`/../indexerd/tmp:/mnt/index a0 searchd.out -i /mnt/index -c0 -C0

$ curl -X POST http://localhost:8921/search --header "Content-Type: application/json" -d '{"ip":"127.0.0.1","page":1,"kw":[{"type":"tex","str":"1+2+\u2026+100"}]}'

Output:

[inverted lists]
[0] (level 2)   9.90 `1+2+' (TeX, upp=9.90, th=0.77)
        [  0] (on disk) prefix/ONE/SIGN (pf=4, ipf=4.90)
        [  1] (on disk) prefix/NUM/SIGN (pf=8, ipf=4.20)
        [  2] (on disk) prefix/ONE/SIGN/ADD (pf=4, ipf=4.90)
        [  3] (on disk) prefix/NUM/SIGN/ADD (pf=8, ipf=4.20)

The TeX string here is wrong. It is expected to be `1+2+…+100', which is what a0 prints when run outside the container.
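
To make the truncation point concrete, here is a small Python sketch (not part of a0) that decodes the same JSON body as the curl command and prints the UTF-8 bytes of the keyword. The helper `ascii_prefix` is a hypothetical name introduced only for illustration. It shows that the observed `1+2+' is exactly the part of the query before the first non-ASCII character. It does not diagnose why a0 stops there.

```python
import json

# Request body copied from the curl command above (raw string keeps \u2026 as a JSON escape).
body = r'{"ip":"127.0.0.1","page":1,"kw":[{"type":"tex","str":"1+2+\u2026+100"}]}'

# The JSON escape \u2026 decodes to U+2026 (HORIZONTAL ELLIPSIS).
query = json.loads(body)["kw"][0]["str"]

# UTF-8 encodes U+2026 as the three bytes e2 80 a6.
encoded = query.encode("utf-8")


def ascii_prefix(s):
    """Hypothetical helper: return the leading run of ASCII characters in s."""
    out = []
    for ch in s:
        if ord(ch) > 127:
            break
        out.append(ch)
    return "".join(out)


print(query)
print(" ".join(f"{b:02x}" for b in encoded))
print(ascii_prefix(query))  # matches the truncated string in the output above
```

The second printed line shows the bytes `e2 80 a6` sitting between `1+2+` and `+100`, which is where the string reported inside the container ends.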