freme-project / e-Internationalization

Apache License 2.0

Whitelabel Error Page error when submitting large HTML #27

Closed m1ci closed 8 years ago

m1ci commented 8 years ago

When submitting

curl -X POST --header "Content-Type: text/html" --header "Accept: text/html" -d @test.html "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=all" -v

with this file.

We get

<html><body><h1>Whitelabel Error Page</h1><p>This application has no explicit mapping for /error, so you are seeing this as a fallback.</p><div id='created'>Thu Oct 29 10:28:42 CET 2015</div><div>There was an unexpected error (type=Internal Server Error, status=500).</div><div>String index out of range: 701</div></body></html>
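Editorial sketch: because the request asks for `Accept: text/html`, the error page above arrives as ordinary-looking HTML, so a client or test can mistake it for a result. The helper below is a minimal Python sketch of how such a response could be recognized and its details extracted. The function name `parse_whitelabel_error` is hypothetical and not part of FREME, and the regular expressions assume the exact page layout quoted above.

```python
import re
from typing import Dict, Optional

# Matches the "(type=..., status=NNN)" fragment seen in the error page above.
_STATUS_RE = re.compile(r"\(type=(?P<type>[^,]+), status=(?P<status>\d{3})\)")


def parse_whitelabel_error(body: str) -> Optional[Dict[str, object]]:
    """Return error details if `body` is a Whitelabel Error Page, else None.

    Hypothetical helper for illustration; it assumes the page layout
    quoted in this thread and is not a general HTML parser.
    """
    if "<h1>Whitelabel Error Page</h1>" not in body:
        return None
    status = _STATUS_RE.search(body)
    # Collect the text of every attribute-less <div>; in the page quoted
    # above, the last one carries the exception message.
    divs = re.findall(r"<div>(.*?)</div>", body, flags=re.S)
    return {
        "type": status.group("type") if status else None,
        "status": int(status.group("status")) if status else None,
        "message": divs[-1] if divs else None,
    }
```

Applied to the response quoted above, this would report the status 500 and the message `String index out of range: 701`; for any body without the Whitelabel heading it returns `None`.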
jnehring commented 8 years ago

I could reproduce the error and traced it to the e-Internationalization code, so I created a unit test in e-Internationalization that reproduces it. The test is currently commented out; remove the comments and run the tests to see the error.

@borriellom please take a look at the error and fix it.

borriellom commented 8 years ago

Fixed. I have also committed the file long-html-enriched.ttl for the unit test. This file was generated by converting the HTML file to NIF and then enriching it through the FREME NER service. I made some changes to the NIF generation process as well.

Please check whether it works now.

pheyvaer commented 8 years ago

It is still not working when I execute the command from @m1ci. It seems to download part of the result, then it stops, and after a while the request times out.

borriellom commented 8 years ago

The result is complete, so I don't know why it tries to read further bytes. @jnehring, do you think this could be an issue in the broker? Smaller files work fine.
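Editorial sketch: a client that keeps waiting for bytes after the full result has arrived is the symptom one would expect when the declared `Content-Length` is larger than the body actually sent. The thread does not confirm that this was the cause here, so the following is only a diagnostic sketch in Python. The function name `content_length_gap` is hypothetical, and the headers and body are assumed to have been captured already (for example from the `curl -v` output).

```python
from typing import Mapping, Optional


def content_length_gap(headers: Mapping[str, str], body: bytes) -> Optional[int]:
    """Declared Content-Length minus the number of bytes received.

    Returns None if no Content-Length header is present. A positive
    value means the client was told to expect more bytes than arrived,
    which would make it wait until it times out.
    """
    for name, value in headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "content-length":
            return int(value) - len(body)
    return None
```

A result of `0` would rule this explanation out; a positive result would point at whichever component set the header.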

jnehring commented 8 years ago

The bug-fixing period for FREME 0.4 ended today, and I am currently preparing the release. This means the bug will ship in the live version of FREME 0.4.

We will fix it in FREME 0.5.

jnehring commented 8 years ago

There was an additional bug in the broker that prevented long HTML from being processed properly. I have fixed it, and I can now process @m1ci's request. I also created an integration test based on that request.

@m1ci and @pheyvaer please test and close the issue in case it is resolved.
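Editorial sketch: the integration test itself is not shown in this thread. For anyone retesting, the following is a minimal Python sketch of replaying @m1ci's request with an explicit timeout, using only the standard library. The endpoint, query parameters, and headers are copied from the curl command at the top of the issue; the function names and the 60-second default timeout are assumptions made for illustration.

```python
import urllib.request
from urllib.parse import urlencode

# Endpoint and query parameters copied from the curl command in this issue.
ENDPOINT = "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents"
PARAMS = {"language": "en", "dataset": "dbpedia", "mode": "all"}


def build_request(html: bytes) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    url = ENDPOINT + "?" + urlencode(PARAMS)
    return urllib.request.Request(
        url,
        data=html,
        method="POST",
        headers={"Content-Type": "text/html", "Accept": "text/html"},
    )


def replay(path: str, timeout: float = 60.0) -> bytes:
    """Send the file at `path` and return the response body.

    The timeout value is an assumption. urlopen raises
    urllib.error.HTTPError for a 500 response, and a timeout error
    if the server stops sending data for longer than `timeout`.
    """
    with open(path, "rb") as fh:
        request = build_request(fh.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

One difference from the original command is worth keeping in mind: `curl -d @test.html` strips carriage returns and newlines from the file before sending, whereas this sketch sends the bytes unchanged, as `curl --data-binary` would.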

pheyvaer commented 8 years ago

Works for me.

m1ci commented 8 years ago

Works for me too.