Inist-CNRS / web-services

Web services at Inist-CNRS
https://services.istex.fr

[pdf-text] 1.1.3: spacy import broken #119

Closed: parmentf closed this issue 1 month ago

parmentf commented 1 month ago

The pdf-text test fails: the service returns HTTP 400 instead of the expected 200.

$ npx hurl --test --variable host="http://192.168.128.151:49297" services/pdf-text/tests.hurl 
services/pdf-text/tests.hurl: Running [1/1]
error: Assert status code
  --> services/pdf-text/tests.hurl:8:6
   |
 8 | HTTP 200
   |      ^^^ actual value is <400>
   |

services/pdf-text/tests.hurl: Failure (1 request(s) in 685 ms)
--------------------------------------------------------------------------------
Executed files:  1
Succeeded files: 0 (0.0%)
Failed files:    1 (100.0%)
Duration:        686 ms

The server's log contains:

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Traceback (most recent call last):
  File "/app/public/./v1/pdf2txt.py", line 11, in <module>
    import spacy

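The failure surfaces when importing spacy, but it is more likely a numpy version problem: this error usually means a compiled extension was built against a different numpy version than the one installed at runtime (see #133, which fixed #117).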
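To check this kind of mismatch inside the container, a small diagnostic script may help. The following is only a sketch: the helper names `installed_versions` and `try_import` are hypothetical, and including `thinc` (a compiled dependency of spaCy) in the list is an assumption about where the incompatibility might sit, not something the log confirms.

```python
import importlib
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def try_import(module_name):
    """Import a module; return (True, None) or (False, short error summary)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except Exception as exc:  # the log shows a ValueError, not an ImportError
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Assumption: these are the distributions worth comparing.
    for name, version in installed_versions(["numpy", "thinc", "spacy"]).items():
        print(f"{name}: {version or 'not installed'}")
    ok, error = try_import("spacy")
    print("import spacy:", "ok" if ok else error)
```

Running it in the same environment as `pdf2txt.py` prints the installed versions next to the import result, which makes it easier to see whether pinning or upgrading numpy is the relevant fix.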