opencitations / bcite

A bibliographic reference correction service

Problems with Installation #4

Open zuphilip opened 6 years ago

zuphilip commented 6 years ago

I tried to install bcite but didn't succeed. For any input, the online app shows:

<class 'IndexError'> at /
list index out of range

and points to https://github.com/opencitations/bcite/blob/63d87e6191db7e80c3445a629618bd2fb9964106/script/web/app.py#L104 but I guess that the problem here is that response=[]. Any idea what could be the problem?


Then I looked into python3 script/api/test/bciteapi.py but couldn't get this working. What I faced so far with this call:

marilenadaquino commented 6 years ago

At the moment bcite is a prototype and is meant to run locally, so I guess the problem is related to the absolute path of the API, which is called twice in app.py:

line 102: request = requests.get('http://localhost:8000/api/citing/'+str(ts)+'/'+urllib.parse.quote(str(citingEntityEncoded)))

line 119: request = requests.get('http://localhost:8000/api/reference/'+str(web.input().time)+'/'+web.input().idRef+'/'+web.input().style+'/'+urllib.parse.quote(referenceText) )

There is a third call in main.js:

request.open('GET', 'http://localhost:8000/api/store/'+timestamp+'/'+$(this).attr('data-update')+'/'+citing+'/'+$(this).attr('id')+'/'+encodeURIComponent($(this).text()), true);

I see that your online instance actually has problems with all the paths (I can't see the graphics). We'll fix it soon so that it runs everywhere with a local path. In the meantime, try modifying your absolute paths.
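One way to avoid editing every hard-coded URL is to read the API base from the environment. This is only a minimal sketch: the variable name BCITE_API_BASE and the helper functions are hypothetical and not part of bcite; only the /api/citing/ path and the urllib.parse.quote call come from the lines quoted above.

```python
import os
import urllib.parse

# Fallback matching the host currently hard-coded in app.py.
DEFAULT_API_BASE = "http://localhost:8000"


def api_base() -> str:
    """Return the API base URL without a trailing slash.

    BCITE_API_BASE is a hypothetical environment variable, not one
    that bcite defines.
    """
    return os.environ.get("BCITE_API_BASE", DEFAULT_API_BASE).rstrip("/")


def citing_url(timestamp, citing_entity_json) -> str:
    """Build the /api/citing/ URL like app.py does, minus the fixed host."""
    encoded = urllib.parse.quote(str(citing_entity_json))
    return f"{api_base()}/api/citing/{timestamp}/{encoded}"
```

With the variable unset the URLs stay the same as today; setting it to the proxied address would redirect all calls built through the helper.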

zuphilip commented 6 years ago

Thank you for your response. I tried to change all paths but the result is still the same. Note also that on the server the API is running on localhost:8000, but this port is not exposed to the outside world; it is proxied to a subdirectory instead. Thus, something like curl localhost:8000 works fine on the server, and I guess that Python can access it as well.

Is there an easy way to test that the API is working as expected? Maybe a short example request?

marilenadaquino commented 5 years ago

Hi, I tried to reproduce the errors you found, and I got an error because an rdflib plugin (the json-ld serialiser) was missing. I updated the readme with suggestions (including the suggested Python version) and the requirements.txt file. I also removed some mandatory fields from the form.

Second, script/api/bciteapi.py is the file you should look at.

A complete workflow requires the API to be called three times, and works only if you input:

The last call is to confirm input data are correct and verified by the user.

Hence I cannot paste here a short example request. Hope it works for you now. Let us know!

zuphilip commented 5 years ago

Thank you. I switched to the new version and updated the python libraries. I don't see any error now, but the result is simply None. I guess that I need to have something in the triplestore before? I don't understand how the API works...

marilenadaquino commented 5 years ago

Did the triplestore start correctly? Can you please paste here the output in the shell?

zuphilip commented 5 years ago

Which output? (For me, the web service just outputs None.)

marilenadaquino commented 5 years ago

Just to double-check.

What does the shell say at this point?

zuphilip commented 5 years ago

Ah, I see. I run it on a server where I only have SSH access, so I start the app in the background with nohup python3 -m script.web.app 8000 &. Here is the Python output for the action:

...:~/bcite$ python3 -m script.web.app 8000
http://0.0.0.0:8000/
127.0.0.1:55974 - - [02/Oct/2018 10:59:54] "HTTP/1.1 GET /" - 200 OK
127.0.0.1:55976 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/js/main.js" - 200
127.0.0.1:55976 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/js/Blob.min.js" - 200
127.0.0.1:55974 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/js/FileSaver.js" - 200
127.0.0.1:55974 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/js/mark.min.js" - 200
127.0.0.1:55978 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/js/xlsx.core.min.js" - 200
127.0.0.1:55978 - - [02/Oct/2018 10:59:55] "HTTP/1.1 GET /static/css/tableexport.css" - 200
127.0.0.1:55978 - - [02/Oct/2018 10:59:57] "HTTP/1.1 GET /static/favicon.ico" - 200
127.0.0.1:55980 - - [02/Oct/2018 11:00:19] "HTTP/1.1 POST /" - 200 OK

However, I see that the Java process reports my web address, not localhost, as the service URL. There are also two warnings:

WARN : NanoSparqlServer.java:517: Starting NSS
WARN : ServiceProviderHook.java:171: Running.
serviceURL: http://134.155.108.51:9999

What should that look like?

zuphilip commented 5 years ago

I gave it another attempt with Docker, see https://github.com/zuphilip/bcite/commit/fe77b9f1c7c9f6efa4375e3d16a4964e1facee5b but this leads me to the same problem(s)...

zuphilip commented 5 years ago

Okay, I had never filled out the ORCID field, which is currently a problem but can be fixed with #7. Now, however, I see the error from the beginning again. The console then shows:

<web.form.Form object at 0x7fa82ed48198>
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:8000
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.crossref.org:443
DEBUG:urllib3.connectionpool:https://api.crossref.org:443 "GET /works/10.1017/S0018246X06005966 HTTP/1.1" 200 1634
[SPACIN CrossrefProcessor - INFO] Data retrieved from 'https://api.crossref.org/works/10.1017/S0018246X06005966'.
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:9999
DEBUG:urllib3.connectionpool:http://localhost:9999 "GET /blazegraph/sparql?query=PREFIX%20fabio%3A%20%3Chttp%3A//purl.org/spar/fabio/%3E%0ASELECT%20%3Fid%0AWHERE%20%7B%0A%20%20%20%20BIND%20%28%3Chttp%3A//localhost%3A8000/corpus/br/17%3E%20AS%20%3Fbr%29%20.%0A%20%20%20%20%3Fbr%20a%20fabio%3AExpression%20.%0A%20%20%20%20BIND%20%28strafter%28str%28%3Fbr%29%2C%20%22/corpus/%22%29%20AS%20%3Fid%29%20.%0A%7D%0ALIMIT%201 HTTP/1.1" 200 None
127.0.0.1:38606 - - [04/Oct/2018 16:48:38] "HTTP/1.1 GET /api/citing/1538671713.103041/{"author": ["Berners-Lee, Tim"], "title": "Title", "journal": "Journal", "volume": "18", "issue": "1", "year": "2009", "publisher": "Wiley", "doi": "10.1017/S0018246X06005966"}" - 200 OK
DEBUG:urllib3.connectionpool:http://localhost:8000 "GET /api/citing/1538671713.103041/%7B%22author%22%3A%20%5B%22Berners-Lee%2C%20Tim%22%5D%2C%20%22title%22%3A%20%22Title%22%2C%20%22journal%22%3A%20%22Journal%22%2C%20%22volume%22%3A%20%2218%22%2C%20%22issue%22%3A%20%221%22%2C%20%22year%22%3A%20%222009%22%2C%20%22publisher%22%3A%20%22Wiley%22%2C%20%22doi%22%3A%20%2210.1017/S0018246X06005966%22%7D HTTP/1.1" 200 None
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/web/application.py", line 257, in process
    return self.handle()
  File "/usr/local/lib/python3.6/site-packages/web/application.py", line 248, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python3.6/site-packages/web/application.py", line 488, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python3.6/site-packages/web/application.py", line 466, in handle_class
    return tocall(*args)
  File "/usr/src/app/script/web/app.py", line 87, in POST
    idCitingRef = response[0]['id']
IndexError: list index out of range

192.168.99.1:55885 - - [04/Oct/2018 16:48:38] "HTTP/1.1 POST /" - 500 Internal Server Error
192.168.99.1:55885 - - [04/Oct/2018 16:48:39] "HTTP/1.1 GET /favicon.ico" - 404 Not Found
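The percent-encoded SPARQL query in the Blazegraph request above is hard to read as logged. A small sketch using the standard library's urllib.parse.unquote makes it legible; the query string is copied from the log, while the variable names are just for illustration.

```python
import urllib.parse

# Query parameter taken from the "GET /blazegraph/sparql?query=..." log line.
ENCODED_QUERY = (
    "PREFIX%20fabio%3A%20%3Chttp%3A//purl.org/spar/fabio/%3E%0A"
    "SELECT%20%3Fid%0A"
    "WHERE%20%7B%0A"
    "%20%20%20%20BIND%20%28%3Chttp%3A//localhost%3A8000/corpus/br/17%3E"
    "%20AS%20%3Fbr%29%20.%0A"
    "%20%20%20%20%3Fbr%20a%20fabio%3AExpression%20.%0A"
    "%20%20%20%20BIND%20%28strafter%28str%28%3Fbr%29%2C%20%22/corpus/%22%29"
    "%20AS%20%3Fid%29%20.%0A"
    "%7D%0A"
    "LIMIT%201"
)

# Turn %XX escapes back into the characters they stand for.
decoded = urllib.parse.unquote(ENCODED_QUERY)
print(decoded)
```

Decoded, the query asks whether <http://localhost:8000/corpus/br/17> is typed as fabio:Expression and, if so, returns the part of its IRI after "/corpus/". If the triplestore holds no such resource the query yields no rows; whether that is what produces the empty response here is an assumption, not something the log confirms.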

atomotic commented 5 years ago

Same behaviour for me. To avoid entering data through the form, I can reproduce the POST this way:

pip3 install httpie
cat data.txt
author=Not,Exists
&title=NO
&ORCID=0000-0001-8770-4972
&references=Gibby,+R.,+and+Brazier,+C.+2012.+Observations+on+the+development+of+non-print+legal+deposit+in+the+UK.+Library+Review+61,+5+(2012),+362-377.+DOI=https://doi.org/10.1108/00242531211280487
&style=MLA
&form_action=search
http --form POST http://localhost:8000/ < data.txt
HTTP/1.1 200 OK
Date: Fri, 05 Oct 2018 09:51:55 GMT
Server: localhost
Transfer-Encoding: chunked

None
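The same POST can be sketched with Python's standard library, for anyone who prefers not to install httpie. The field values are those from data.txt, assuming the "+" characters there stand for spaces; the function names are hypothetical, and the URL is the local one used above.

```python
import urllib.parse
import urllib.request

# Form fields from data.txt, with "+" read as a space (an assumption).
FORM_FIELDS = {
    "author": "Not,Exists",
    "title": "NO",
    "ORCID": "0000-0001-8770-4972",
    "references": (
        "Gibby, R., and Brazier, C. 2012. Observations on the development "
        "of non-print legal deposit in the UK. Library Review 61, 5 (2012), "
        "362-377. DOI=https://doi.org/10.1108/00242531211280487"
    ),
    "style": "MLA",
    "form_action": "search",
}


def build_request(url, fields):
    """Prepare a form-encoded POST request without sending it."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def post_form(url="http://localhost:8000/", fields=FORM_FIELDS):
    """Send the form and return the response body as text."""
    with urllib.request.urlopen(build_request(url, fields)) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling post_form() against a running local instance should return the same body that httpie prints.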

marilenadaquino commented 5 years ago

I see, I tried again and everything works fine. I get that error only when I do not include the DOI of the citing article, which is a required field for creating the KG.

marilenadaquino commented 5 years ago

I committed a change in app.py. ORCID is removed (we do not need it at this stage), and the page now renders correctly, so you won't see None anymore; instead, the form fields that are missing required values are highlighted.

zuphilip commented 5 years ago

Validation works for me now, but I see the same error as at the very beginning of this issue. It seems that the API request in line 81 returns an empty result: https://github.com/opencitations/bcite/blob/3611deba94cdda67320075b2ddc17856f99986b1/script/web/app.py#L81

i.e. after this line I have request=[], which then leads to the error some lines later.
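Until the cause of the empty result is found, the IndexError itself could be avoided with a guard before indexing. This is a minimal sketch, assuming the API result is a list of dicts with an 'id' key, as the traceback line idCitingRef = response[0]['id'] suggests; first_id is a hypothetical helper, not a function in app.py.

```python
def first_id(response):
    """Return the 'id' of the first result, or None if nothing came back.

    Assumes `response` is a list of dicts, as implied by
    response[0]['id'] in the traceback.
    """
    if not response:  # covers both [] and None
        return None
    return response[0].get("id")
```

A caller could then show a message such as "no citing entity found" when first_id returns None, instead of answering with a 500 error.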