freme-project / freme-project.github.io

Apache License 2.0
2 stars 0 forks source link

Character encoding #55

Closed jnehring closed 8 years ago

jnehring commented 9 years ago

Data send to FREME via the POST body needs to be UTF-8 encoded. This ticket has two tasks

  1. Try to send UTF-8 data to all enrichment endpoints via POST body. A suitable string for testing is e.g.
Madrid (/məˈdrɪd/, Spanish: [maˈðɾið], locally: [maˈðɾiθ, -ˈðɾi]) is a south-western European city and the capital and largest municipality of Spain.

You should check that the special characters in the NIF response are not broken.

  1. Mention in the general section of the API doc that all content in the body needs to be UTF-8 encoded.
ArneBinder commented 8 years ago

Manually checked utf-8 content as content of input parameter and body via the documentation page for the following e-services (no problems encountered):

ArneBinder commented 8 years ago

Tried to implement an utf-8 content integration test in the Broker/integration_test branch, but it doesnt work. Could not figure out why, the same input works via the documentation page. @jnehring any ideas?

jnehring commented 8 years ago

@ArneBinder I will take a look but that might take some time. In the meantime please update the configuration as written above. When you are done please assign the issue to me as a reminder to take a look at the integration test.

jnehring commented 8 years ago

I checkout out the branch and ran the tests. Everything looks fine to me.

image

So can you please describe what went wrong?

I see that you only test for valid NIF response. I did not find code that tests if the special characters are correct in the response.

ArneBinder commented 8 years ago

I get this error:

INFO    2015-10-01 15:21:54,725 [main] eu.freme.broker.integration_tests.UTF8ContentTest  - 
@prefix nif:<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/document/1#char=0,149>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString "Madrid (/m??dr?d/, Spanish: [ma?ð?ið], locally: [ma?ð?i?, -?ð?i]) is a south-western European city and the capital and largest municipality of Spain.";
    nif:beginIndex "0"^^xsd:nonNegativeInteger;
    nif:endIndex "149"^^xsd:nonNegativeInteger;
    nif:sourceUrl <http://differentday.blogspot.com/2007_01_01_archive.html> .

INFO    2015-10-01 15:21:54,727 [main] eu.freme.broker.integration_tests.UTF8ContentTest  - Test e-entity/freme-ner/
ERROR   2015-10-01 15:21:54,740 [http-nio-8000-exec-6] eu.freme.broker.eservices.BaseRestController  - Request: http://localhost:8000/e-entity/freme-ner/documents raised 
org.apache.jena.atlas.AtlasException: java.nio.charset.MalformedInputException: Input length = 1
    at org.apache.jena.atlas.io.IO.exception(IO.java:206)
    at org.apache.jena.atlas.io.CharStreamBuffered$SourceReader.fill(CharStreamBuffered.java:77)
    at org.apache.jena.atlas.io.CharStreamBuffered.fillArray(CharStreamBuffered.java:154)
    at org.apache.jena.atlas.io.CharStreamBuffered.advance(CharStreamBuffered.java:137)
    at org.apache.jena.atlas.io.PeekReader.advanceAndSet(PeekReader.java:243)
    at org.apache.jena.atlas.io.PeekReader.init(PeekReader.java:237)
    at org.apache.jena.atlas.io.PeekReader.peekChar(PeekReader.java:159)
    at org.apache.jena.atlas.io.PeekReader.makeUTF8(PeekReader.java:100)
    at org.apache.jena.riot.tokens.TokenizerFactory.makeTokenizerUTF8(TokenizerFactory.java:41)
    at org.apache.jena.riot.RiotReader.createParser(RiotReader.java:138)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:164)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:859)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:255)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:241)
    at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:70)
    at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:286)
    at eu.freme.broker.eservices.FremeNER.execute(FremeNER.java:139)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:137)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:776)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:705)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:967)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:869)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:648)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:843)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:154)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at eu.freme.broker.security.ManagementEndpointAuthenticationFilter.doFilter(ManagementEndpointAuthenticationFilter.java:81)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at eu.freme.broker.security.AuthenticationFilter.doFilter(AuthenticationFilter.java:86)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:110)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:57)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:50)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
    at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
    at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at eu.freme.broker.tools.EInternationalizationFilter.doFilter(EInternationalizationFilter.java:84)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at eu.freme.broker.tools.CORSFilter.doFilter(CORSFilter.java:44)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at com.github.isrsal.logging.LoggingFilter.doFilterInternal(LoggingFilter.java:46)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:85)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:518)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1091)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:668)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1521)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1478)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.Reader.read(Reader.java:140)
    ... 100 more
DEBUG   2015-10-01 15:21:54,758 [http-nio-8000-exec-6] com.github.isrsal.logging.LoggingFilter  - Request: request id=7; content type=text/plain; charset=UTF-8; uri=/e-entity/freme-ner/documents?language=en&informat=turtle&dataset=dbpedia; payload=@prefix nif:<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/document/1#char=0,149>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString "Madrid (/m??dr?d/, Spanish: [ma?ð?ið], locally: [ma?ð?i?, -?ð?i]) is a south-western European city and the capital and largest municipality of Spain.";
    nif:beginIndex "0"^^xsd:nonNegativeInteger;
    nif:endIndex "149"^^xsd:nonNegativeInteger;
    nif:sourceUrl <http://differentday.blogspot.com/2007_01_01_archive.html> .

DEBUG   2015-10-01 15:21:54,759 [http-nio-8000-exec-6] com.github.isrsal.logging.LoggingFilter  - Response: request id=7; payload={
  "exception": "org.apache.jena.atlas.AtlasException",
  "path": "/e-entity/freme-ner/documents",
  "message": "java.nio.charset.MalformedInputException: Input length = 1",
  "error": "Internal Server Error",
  "status": 500,
  "timestamp": 1443705714757
}

java.lang.AssertionError
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertTrue(Assert.java:52)
    at eu.freme.broker.integration_tests.IntegrationTest.validateNIFResponse(IntegrationTest.java:98)
    at eu.freme.broker.integration_tests.UTF8ContentTest.testFremeNer(UTF8ContentTest.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
jnehring commented 8 years ago

@ArneBinder can you please try if this solves the problem?

https://maven.apache.org/plugins/maven-resources-plugin/examples/encoding.html

ArneBinder commented 8 years ago

no changes, still getting the error. Miraculously just testDBPediaSpotlight, testELink and testFremeNer are failing.

jnehring commented 8 years ago

I will take a look at this.

jnehring commented 8 years ago

Lets not follow this issue any longer. It is not so important and we never had any problems with UTF 8 encodings.