Closed AravindRam closed 8 years ago
Hey @AravindRam, your setup looks alright. It seems NER wasn't able to pick up locations from your index. Can I ask why you have only Geographic_LATITUDE, Geographic_LONGITUDE, and Geographic_NAME in your Solr index?
It seems you already ran the GeoTopic parser on your data, and the index you are pointing to stores the result of that parsing. The GeoParser UI should still process your index, but I think NER must have failed to extract locations from such a sparse index.
Thanks Madhav
Hey @smadha
We actually wanted to try and check whether the geoparser works with these fields or if it looks for some specific field names so that we'll push the correct field names while creating the actual index for the assignment.
Also, we ran it against the memex collection index that we created in the previous semester, and it gave the same error. So I don't think the problem is caused by the sparse index.
I have attached new_core.zip here for your reference. Please have a look and let me know what the possible issue could be.
Thanks
Our team is also facing the same issue, even though the fields are present in our Solr index at port 8984 (solr_index_response.json.txt):
"Geographic_LONGITUDE": "-117.15726",
"Geographic_NAME": "San Diego",
"Optional_LATITUDE1": "37.25022",
"Optional_LONGITUDE1": "-119.75126",
"Geographic_LATITUDE": "32.71533",
"Optional_NAME1": "California",
Memex GeoParser couldn't parse the points when it loaded them into its own Solr at port 8983 (memex_solr_index_response.json.txt):
"points": [ "[]" ],
"id": "10.1919/edu/universityofcalifornia/libraries/88E212AE3BDDB31E83957348F9F373546F77CF212F5BA6917C23CD2F37959878",
"version": 1530436911583199200
},
We experimented with 2 of our sample files.
Please guide us in case we are heading the wrong way.
Thanks
@AravindRam I tried your zipped core and I could get 1 location out of it.
I pushed a commit today. Can you pull and try again?
Thanks
@AravindRam @raviraju Can you also please make sure that the Tika server is running on port 8001 and that you see proper results, as described in https://wiki.apache.org/tika/GeoTopicParser
curl -T $HOME/src/geotopicparser-utils/geotopics/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:8001/rmeta
This turned out to be the issue for some other people.
Thanks @smadha, yes, the Tika server is running on port 8001:
ravirajukrishna@ubuntu:~$ curl -T $HOME/src/geotopicparser-utils/geotopics/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:8001/rmeta
[{"Content-Type":"application/geotopic","X-Parsed-By":["org.apache.tika.parser.DefaultParser","org.apache.tika.parser.geo.topic.GeoParser"],"X-TIKA:parse_time_millis":"5","resourceName":"polar.geot"}]
We shall try with your latest commit
@smadha
The Tika server is running properly on port 8001, and I tested it using the curl call as well. It returns the expected output as per the wiki link.
I tried running it again with the latest commit. The problem still seems to persist.
@raviraju : Were you able to get over the error after running with the latest commit?
@raviraju I can't see any locations in your response.
@AravindRam Let's connect on Hangouts. Can you ping me at msharan@usc.edu?
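To make that check less manual, here is a minimal sketch of how one might scan a Tika /rmeta response for location fields. The function name location_names is hypothetical, and the field names (Geographic_NAME, Optional_NAME1, ...) are assumed from the Solr index dump posted earlier in this thread. The sample string is the response body pasted above.

```python
import json


def location_names(rmeta_json):
    """Collect Geographic_NAME / Optional_NAME* values from a /rmeta response body.

    Hypothetical helper: field names are assumed from the index dump in this thread.
    """
    names = []
    for doc in json.loads(rmeta_json):
        for key, value in doc.items():
            if key == "Geographic_NAME" or key.startswith("Optional_NAME"):
                # Multi-valued metadata may arrive as a list
                if isinstance(value, list):
                    names.extend(value)
                else:
                    names.append(value)
    return names


# Response body pasted earlier in the thread: it has no Geographic_* keys
sample = (
    '[{"Content-Type":"application/geotopic",'
    '"X-Parsed-By":["org.apache.tika.parser.DefaultParser",'
    '"org.apache.tika.parser.geo.topic.GeoParser"],'
    '"X-TIKA:parse_time_millis":"5","resourceName":"polar.geot"}]'
)
print(location_names(sample))  # prints []
```

An empty list here suggests the Tika server answered but did not extract any locations from the file.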
Yes sure.
@smadha
The problem did not get resolved when I tried running after adding space between the fields (latest commit)
Well I tried indexing your core and it works fine for me. You need to debug this function https://github.com/MBoustani/GeoParser/blob/master/geoparser_app/views.py#L252
Can you start from scratch and share all your server logs through pastebin?
@AravindRam as we discussed please make sure your tika server is running at 8001 and is able to extract locations.
@smadha
As you said, the problem was only the port number. I changed the port number to 9998 in config.txt, and it was then able to extract the locations.
Thanks a lot for the help. I appreciate it :) :+1:
Hi @smadha ,
I tried to geoparse using a sample index containing 6 documents. The following is the screenshot which shows the index having 6 documents. The core which I have created is new_core and Solr 4.10 is running on port number 8984.
The solr index path which is passed to the geoparser is http://localhost:8984/solr/new_core/ which can be seen in the screenshot attached.
Once I geoparse it, it says Solr has successfully created the index but it does not retrieve any geo locations which are present in the index. I have attached the screenshot of the error which gets thrown in the console.
If you notice, it says it has geotagged 6 points but it throws a KeyError : "response"
When I tried printing the response to the request url, it returns the following
{'responseHeader': {'status': 400, 'QTime': 1, 'params': {'q': '-points:"[]"', 'start': '0', 'rows': '50000', 'fl': 'points,id', 'wt': 'json'}}, 'error': {'msg': 'undefined field points', 'code': 400}}
Can you please let me know what is causing this problem? I checked the configuration files: the Solr URL and port number (8983) are correct, and the Tika server is running properly.
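For anyone hitting the same KeyError: the payload above has an 'error' key and no 'response' key, so any code that reads result['response'] directly will fail this way. Below is a minimal sketch of a guard, assuming the standard Solr JSON layout (response.docs on success, error.msg and error.code on failure). The function name docs_from_solr is hypothetical and this is not GeoParser's actual code.

```python
def docs_from_solr(result):
    """Return the docs list from a parsed Solr JSON response.

    Hypothetical helper: raises a readable error instead of KeyError
    when Solr reports a failure.
    """
    if "error" in result:
        err = result["error"]
        raise RuntimeError(
            "Solr returned {}: {}".format(err.get("code"), err.get("msg"))
        )
    return result["response"]["docs"]


# The error payload printed in this issue
failed = {
    "responseHeader": {
        "status": 400,
        "QTime": 1,
        "params": {"q": '-points:"[]"', "start": "0", "rows": "50000",
                   "fl": "points,id", "wt": "json"},
    },
    "error": {"msg": "undefined field points", "code": 400},
}

try:
    docs_from_solr(failed)
except RuntimeError as exc:
    print(exc)  # prints: Solr returned 400: undefined field points
```

With a guard like this, the console would show Solr's own message ("undefined field points") rather than a bare KeyError.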