freme-project / freme-ner

code in <head> not correct after NER #163

Closed pheyvaer closed 7 years ago

pheyvaer commented 7 years ago

Request:

curl -X POST --header 'Content-Type: text/html' --header 'Accept: text/html' -d '<!DOCTYPE HTML>
<html>
    <head>
        <title>Leipzig</title>
        <meta charset="UTF-8">
        <link rel="stylesheet" type="text/css" href="media/style.css">
    </head>
    <body>
        <figure><img src="media/File_Leipzig_Fockeberg_Zentrum.jpg" alt="Leipzig Fockeberg Zentrum"></figure>
        <main>
        <h1>Leipzig</h1>
        <p>Leipzig (/ˈlaɪpsɪɡ/; German: [ˈlaɪptsɪç]) is the largest city in the federal state of Saxony, Germany. With a population of 570,087 inhabitants (1,001,220 residents in the larger urban zone) it is Germany\u0027s tenth most populous city. Leipzig is located about 160 kilometers (99 miles) southwest of Berlin at the confluence of the White Elster, Pleisse, and Parthe rivers at the southern end of the North German Plain.</p>
        <p>Leipzig has been a trade city since at least the time of the Holy Roman Empire. The city sits at the intersection of the Via Regia and Via Imperii, two important Medieval trade routes. Leipzig was once one of the major European centers of learning and culture in fields such as music and publishing. Leipzig became a major urban center within the German Democratic Republic (East Germany) after World War II, but its cultural and economic importance declined despite East Germany being the richest economy in the Soviet Bloc.</p>
        </main>
    </body>
</html>' 'https://api.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=spot%2Clink'

Output:

<html>
 <head> 
  <title>Leipzig</title> 
 </head> 
 <body>
   &gt; 
  <link href="media/style.css" rel="stylesheet" type="text/css"> 
  <figure> 
   <img alt="Leipzig Fockeberg Zentrum" src="media/File_Leipzig_Fockeberg_Zentrum.jpg"> 
  </figure> 
  <main> 
   <h1>Leipzig</h1> 
   <p>Leipzig (/ˈlaɪpsɪɡ/; German: [ˈlaɪptsɪç]) is the largest city in the federal state of <span its-ta-confidence="0.9958755498461204" its-ta-ident-ref="http://dbpedia.org/resource/Saxony">Saxony</span>, Germany. With a population of 570,087 inhabitants (1,001,220 residents in the larger urban zone) it is Germany's tenth most populous city. Leipzig is located about 160 kilometers (99 miles) southwest of <span its-ta-confidence="0.9970024931402685" its-ta-ident-ref="http://dbpedia.org/resource/Berlin">Berlin</span> at the confluence of the <span its-ta-confidence="0.8995896545652485" its-ta-ident-ref="http://dbpedia.org/resource/White_Elster">White Elster</span>, <span its-ta-confidence="0.9251154157196518" its-ta-ident-ref="http://dbpedia.org/resource/Pleiße">Pleisse</span>, and <span its-ta-confidence="0.9894714843819582" its-ta-ident-ref="http://dbpedia.org/resource/Parthe">Parthe</span> rivers at the southern end of the <span its-ta-confidence="0.9989857046025482" its-ta-ident-ref="http://dbpedia.org/resource/North_German_Plain">North German Plain</span>.</p> 
   <p>Leipzig has been a trade city since at least the time of the <span its-ta-confidence="0.9300453574097621" its-ta-ident-ref="http://dbpedia.org/resource/Holy_Roman_Empire">Holy Roman Empire</span>. The city sits at the intersection of the <span its-ta-confidence="0.9329751389202003" its-ta-ident-ref="http://dbpedia.org/resource/Via_Regia">Via Regia</span> and <span its-ta-confidence="0.6244577603818485" its-ta-ident-ref="http://dbpedia.org/resource/Via_Imperii">Via Imperii</span>, two important <span its-ta-confidence="0.47309400393377643" its-ta-ident-ref="http://dbpedia.org/resource/Middle_Ages">Medieval</span> trade routes. Leipzig was once one of the major <span its-ta-confidence="0.5553788656024554" its-ta-ident-ref="http://dbpedia.org/resource/Ethnic_groups_in_Europe">European</span> centers of learning and culture in fields such as music and publishing. Leipzig became a major urban center within the German Democratic Republic (East Germany) after <span its-ta-confidence="0.9985759855690182" its-ta-ident-ref="http://dbpedia.org/resource/World_War_II">World War II</span>, but its cultural and economic importance declined despite East Germany being the richest economy in the <span its-ta-confidence="0.45008612387273" its-ta-ident-ref="http://dbpedia.org/resource/Eastern_Bloc">Soviet Bloc</span>.</p> 
  </main>  
 </body>
</html>

In the output, the <link> has moved into the <body>, the <meta> is missing, and a stray &gt; has been added to the <body>.

I don't know whether this is related to NER, e-Internationalization, or both.
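
The symptoms in the <head> can also be checked mechanically. The following is a minimal sketch, not part of FREME: it uses Python's standard html.parser module to list which tags from the <head> of the original document no longer appear in the <head> of the processed one. The names head_tags and missing_from_head are hypothetical, and the sample strings are shortened versions of the request and output above.

```python
from html.parser import HTMLParser


class HeadTagCollector(HTMLParser):
    """Collect the names of start tags that literally appear between <head> and </head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head:
            self.tags.append(tag)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def head_tags(html):
    """Return the sorted tag names found inside <head> in the given markup."""
    collector = HeadTagCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.tags)


def missing_from_head(original, processed):
    """Return tags present in the original <head> but absent from the processed <head>."""
    remaining = head_tags(processed)
    missing = []
    for tag in head_tags(original):
        if tag in remaining:
            remaining.remove(tag)
        else:
            missing.append(tag)
    return missing


# Shortened (assumed) versions of the request body and of the first output.
original = (
    '<!DOCTYPE HTML><html><head><title>Leipzig</title><meta charset="UTF-8">'
    '<link rel="stylesheet" type="text/css" href="media/style.css"></head>'
    "<body><h1>Leipzig</h1></body></html>"
)
processed = (
    "<html><head><title>Leipzig</title></head><body>&gt;"
    '<link href="media/style.css" rel="stylesheet" type="text/css">'
    "<h1>Leipzig</h1></body></html>"
)

print(missing_from_head(original, processed))
```

For these two strings the check reports link and meta as missing from the <head>. It does not detect the stray &gt; in the <body>.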

katia-vistatec commented 7 years ago

Hi, this error does not occur on api-dev. Using the attached curl request, I got the attached result.txt, which does not show the error. Attachments: inputDoc.txt, curlRequest.txt, result.txt

jnehring commented 7 years ago

It will be installed on freme-live in the next release: https://github.com/freme-project/FREMECommon/issues/41

jnehring commented 7 years ago

I think this is fixed on live as well. @pheyvaer, please check and close the issue once it is solved.

pheyvaer commented 7 years ago

No, this is what I get.

<!DOCTYPE html> 
<html> 
 <head> 
 </head> 
 <body>
   charset= 
  <span its-ta-confidence="0.7784975536946784" its-ta-ident-ref="http://dbpedia.org/resource/UTF-8">UTF-8</span>&gt; k rel=stylesheet type=text/css href=media/style.css&gt; src=media/File_Leipzig_Fockeberg_Zentrum.jpg alt= 
  <span its-ta-confidence="0.4803560645840885">Leipzig Fockeberg Zentrum</span>&gt; &gt; (/ˈlaɪpsɪɡ/; German: [ˈlaɪptsɪç]) is the largest city in the federal state of 
  <span its-ta-confidence="0.9958755498461487" its-ta-ident-ref="http://dbpedia.org/resource/Saxony">Saxony</span>, Germany. With a population of 570,087 inhabitants (1,001,220 residents in the larger urban zone) it is Germany\u0027s tenth most populous city. Leipzig is located about 160 kilometers (99 miles) southwest of 
  <span its-ta-confidence="0.9970024931402685" its-ta-ident-ref="http://dbpedia.org/resource/Berlin">Berlin</span> at the confluence of the 
  <span its-ta-confidence="0.8995896545652485" its-ta-ident-ref="http://dbpedia.org/resource/White_Elster">White Elster</span>, 
  <span its-ta-confidence="0.9251154157196518" its-ta-ident-ref="http://dbpedia.org/resource/Pleiße">Pleisse</span>, and 
  <span its-ta-confidence="0.9894714843819582" its-ta-ident-ref="http://dbpedia.org/resource/Parthe">Parthe</span> rivers at the southern end of the 
  <span its-ta-confidence="0.9989857046025482" its-ta-ident-ref="http://dbpedia.org/resource/North_German_Plain">North German Plain</span>. 
  <p>has been a trade city since at least the time of the <span its-ta-confidence="0.8367624565915283" its-ta-ident-ref="http://dbpedia.org/resource/Holy_Roman_Empire">Holy Roman Empire</span>. The city sits at the intersection of the <span its-ta-confidence="0.9329751389202003" its-ta-ident-ref="http://dbpedia.org/resource/Via_Regia">Via Regia</span> and <span its-ta-confidence="0.6244577603818485" its-ta-ident-ref="http://dbpedia.org/resource/Via_Imperii">Via Imperii</span>, two important <span its-ta-confidence="0.47309400393377643" its-ta-ident-ref="http://dbpedia.org/resource/Middle_Ages">Medieval</span> trade routes. Leipzig was once one of the major <span its-ta-confidence="0.7364624100740034" its-ta-ident-ref="http://dbpedia.org/resource/European_Union_Centers_of_Excellence">European centers</span> of learning and culture in fields such as music and publishing. Leipzig became a major urban center within the German Democratic Republic (East Germany) after <span its-ta-confidence="0.9985759855690182" its-ta-ident-ref="http://dbpedia.org/resource/World_War_II">World War II</span>, but its cultural and economic importance declined despite East Germany being the richest economy in the <span its-ta-confidence="0.45008612387273" its-ta-ident-ref="http://dbpedia.org/resource/Eastern_Bloc">Soviet Bloc</span>.</p> 
  <p> &gt; </p> 
 </body> 
</html>

Now, it even did NER on the information that is (originally) in <head>.

jnehring commented 7 years ago

"Now, it even did NER on the information that is (originally) in <head>."

It looks more like it strips the content of the head.

I also wonder about the line <p> &gt; </p> (third-to-last line). Where does it come from?

Is this critical for the SWIB conference?

pheyvaer commented 7 years ago

Only if people actually try this ;)

katia-vistatec commented 7 years ago

I was trying to check this but I got an error (the same as in https://github.com/freme-project/freme-ner/issues/165)

fsasaki commented 7 years ago

This also affects other demos. For example, it breaks the call to e-Entity at https://api.freme-project.eu/ckeditor/ckeditor/samples/freme.html, probably also in relation to https://github.com/freme-project/freme-ner/issues/165

sandroacoelho commented 7 years ago

Hi @fsasaki, @pheyvaer, @katia-vistatec

FREME-NER was not running on the server. I have just started it and will check why the process stopped. Could you please check whether it is OK now?

Thanks,

katia-vistatec commented 7 years ago

My request: curl -X POST --header "Content-Type: text/html" --header "Accept: text/html" --data "@inputDoc.txt" "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=spot%2Clink" > result.txt

I get the same error:

{ "exception": "eu.freme.bservices.filter.proxy.exception.BadGatewayException", "path": "/e-entity/freme-ner/documents", "message": "Proxy failed: org.apache.http.conn.HttpHostConnectException: Connect to rv2622.1blu.de:7001 [rv2622.1blu.de/178.254.20.8] failed: Connection refused", "error": "Bad Gateway", "status": 502, "timestamp": 1480076239909 }
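
Because the request above redirects the response into result.txt, an error like this one ends up in the file without any visible warning. As a minimal sketch, assuming error responses always carry the "exception" and "status" fields seen here, a small Python helper could tell such a JSON error apart from a regular HTML result. The name summarize_error is hypothetical, and the sample body is a shortened copy of the response above.

```python
import json


def summarize_error(body):
    """Return a one-line summary if body looks like the JSON error above, else None."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. the expected HTML output.
        return None
    if isinstance(payload, dict) and "exception" in payload and "status" in payload:
        error = payload.get("error", "")
        message = payload.get("message", "")
        return f"{payload['status']} {error}: {message}"
    return None


# Shortened (assumed) copy of the error body shown above.
error_body = (
    '{"exception": "eu.freme.bservices.filter.proxy.exception.BadGatewayException", '
    '"path": "/e-entity/freme-ner/documents", "message": "Proxy failed", '
    '"error": "Bad Gateway", "status": 502}'
)

print(summarize_error(error_body))
print(summarize_error("<html><body><p>Leipzig</p></body></html>"))
```

Running the helper on the contents of result.txt would flag a failed call before anyone inspects the markup.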

pheyvaer commented 7 years ago

The example that I initially posted works now.

sandroacoelho commented 7 years ago

Thank you @pheyvaer.

@katia-vistatec: rv2622.1blu.de:7001 is freme-ner-dev. I have started it now. Could you please check again?

fsasaki commented 7 years ago

The call to FREME NER with type "all" works now; with "link" it still does not work. See the request below.

curl -X POST --header 'Content-Type: text/plain' --header 'Accept: text/turtle' -d 'Berlin' 'https://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=link'
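
When testing several modes in a row, building the URL programmatically avoids quoting and encoding slips in hand-written commands. This is a minimal sketch using Python's standard urllib.parse. The helper name build_url is hypothetical, while the base URL and the language, dataset, and mode parameters are taken from the requests in this thread.

```python
from urllib.parse import urlencode

# Endpoint used in the request above.
BASE = "https://api-dev.freme-project.eu/current/e-entity/freme-ner/documents"


def build_url(modes, language="en", dataset="dbpedia", base=BASE):
    """Join the modes with commas and percent-encode the query string."""
    query = urlencode(
        {"language": language, "dataset": dataset, "mode": ",".join(modes)}
    )
    return f"{base}?{query}"


print(build_url(["link"]))
print(build_url(["spot", "link"]))
```

The second call encodes the comma as %2C, which matches the mode=spot%2Clink form used in the original request.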

sandroacoelho commented 7 years ago

@fsasaki: I will try to update it with the "link" option. Give me 10 more minutes.

katia-vistatec commented 7 years ago

It is working for me as well. Attachments: result-dev.txt, result-live.txt

sandroacoelho commented 7 years ago

@fsasaki: the modes spot, link, and classify are now enabled in FREME-NER dev. Could you please test again?

Thanks

fsasaki commented 7 years ago

Works, thanks a lot!

jnehring commented 7 years ago

This is fixed. It also works on live, although there was no release, so I am closing the issue.