freme-project / e-Internationalization

Apache License 2.0

Roundtripping for long html #9

Closed: jnehring closed this issue 8 years ago

jnehring commented 8 years ago

I was trying out e-Internationalization on real-world data. I took the HTML source of http://vistatec.ie and sent it to FREME NER. With informat=text/html and outformat=turtle (conversion from HTML to NIF), the call is very fast. But with informat=text/html and outformat=text/html (HTML roundtripping), the API call times out after 30 seconds.
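For anyone who wants to reproduce the two calls, here is a minimal sketch in Python. Only the informat and outformat parameters come from this issue. The endpoint URL and the helper names `build_request` and `call_ner` are hypothetical placeholders, so substitute the address of your own FREME NER deployment.

```python
import urllib.request
from urllib.parse import urlencode

# Hypothetical endpoint (assumption): replace with your FREME NER URL.
FREME_NER_URL = "https://freme.example.org/e-entity/freme-ner/documents"


def build_request(html, outformat, base_url=FREME_NER_URL):
    """Build a POST request that sends HTML and asks for `outformat` back."""
    # informat/outformat are the two parameters named in this issue.
    query = urlencode({"informat": "text/html", "outformat": outformat})
    return urllib.request.Request(
        f"{base_url}?{query}",
        data=html.encode("utf-8"),
        headers={"Content-Type": "text/html"},
        method="POST",
    )


def call_ner(html, outformat, timeout=30):
    """Send the request; urlopen raises an error if the timeout is exceeded."""
    request = build_request(html, outformat)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `call_ner(page, "turtle")` would correspond to the fast HTML to NIF conversion, and `call_ner(page, "text/html")` to the roundtripping call that timed out.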

In the branch "long-roundtripping" I created a unit test for long roundtripping. It performs the above-mentioned conversion from NIF to HTML. I started it and it had not responded after 10 minutes.
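A test like this is easier to work with if it fails after a fixed time budget instead of hanging. The following is only a sketch of that idea in Python, not the test in the branch. `run_with_deadline` is a hypothetical helper, and the conversion function passed to it stands in for whatever roundtripping call is under test.

```python
import concurrent.futures


def run_with_deadline(func, arg, seconds):
    """Run func(arg) in a worker thread and give up after `seconds`.

    Raises concurrent.futures.TimeoutError if the call overruns.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, arg)
    try:
        return future.result(timeout=seconds)
    finally:
        # Do not block on a stuck worker. Note that a thread caught in a
        # real infinite loop cannot be killed and keeps running.
        pool.shutdown(wait=False)
```

Because a stuck thread cannot be stopped, this only makes the test report the failure quickly. Running the conversion in a separate process would be needed to actually terminate it.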

For small HTML pages like <p>Welcome to Berlin</p> the roundtripping works, but for normal real-world HTML it seems to be too slow.

borriellom commented 8 years ago

It's not too slow. There was a bug: with this HTML file the code entered an infinite loop at some point. I also fixed some other small issues. The changes have been committed to the master branch. While fixing the bug I found a new issue: #11

jnehring commented 8 years ago

I tested it and now it works fine and fast. Thank you!