The Geoparser is a software tool that can process information from any type of file, extract geographic coordinates, and visualize locations on a map. Users who want a geographical representation of their data can search for locations with the Geoparser, either through a search index or by uploading files from their computer. The Geoparser parses the files and visualizes cities or latitude-longitude points on the map. Once the information is parsed and the points are plotted, users can filter their results by density, or by searching for a keyword and applying a "facet" to the parsed information. On the map, users can click a location point to reveal more information about the location and how it relates to their search.
docker build -t nasajplmemex/geo-parser --no-cache -f Dockerfile .
docker-compose up -d
Open http://localhost:8000 in your browser.
GeoParser has been updated with a new, easy-to-use Docker install, along with an example that downloads the COVID-19 literature data and displays its locations. Use that example to explore and test GeoParser on a real dataset.
pip install -r requirements.txt
Run Solr
Change directory to where you cloned the project
cd Solr/solr-5.3.1/
./bin/solr start
Clone lucene-geo-gazetteer repo
git clone https://github.com/chrismattmann/lucene-geo-gazetteer.git
cd lucene-geo-gazetteer
mvn install assembly:assembly
Add lucene-geo-gazetteer/src/main/bin to your PATH environment variable
Make sure it is working
lucene-geo-gazetteer --help
usage: lucene-geo-gazetteer
 -b,--build <gazetteer file>           The Path to the Geonames allCountries.txt
 -h,--help                             Print this message.
 -i,--index <directoryPath>            The path to the Lucene index directory to either create or read
 -s,--search <set of location names>   Location names to search the Gazetteer for
You will now need to build a Gazetteer using the Geonames.org dataset (1.2 GB).
cd lucene-geo-gazetteer
curl -O http://download.geonames.org/export/dump/allCountries.zip
unzip allCountries.zip
lucene-geo-gazetteer -i geoIndex -b allCountries.txt
Make sure it is working
lucene-geo-gazetteer -s Pasadena Texas
[
{"Texas" : [
"Texas",
"-91.92139",
"18.05333"
]},
{"Pasadena" : [
"Pasadena",
"-74.06446",
"4.6964"
]}
]
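If you want to use the search results from a script, the JSON printed above can be parsed directly. The following is a minimal sketch, not part of GeoParser itself: the function name `parse_gazetteer_output` is hypothetical, and the field order (name, longitude, latitude) is an assumption inferred from the sample output, where the first number for Texas (-91.92139) is outside the valid latitude range.

```python
import json


def parse_gazetteer_output(text):
    """Parse the JSON printed by `lucene-geo-gazetteer -s` into a dict.

    Assumes each entry maps the searched name to a list of
    [resolved name, longitude, latitude], as in the sample output.
    """
    results = {}
    for entry in json.loads(text):
        for query, fields in entry.items():
            results[query] = {
                "name": fields[0],
                "longitude": float(fields[1]),
                "latitude": float(fields[2]),
            }
    return results


# Sample copied from the output shown above.
SAMPLE = """[
{"Texas" : ["Texas", "-91.92139", "18.05333"]},
{"Pasadena" : ["Pasadena", "-74.06446", "4.6964"]}
]"""

locations = parse_gazetteer_output(SAMPLE)
for query, place in locations.items():
    print(query, place["latitude"], place["longitude"])
```

In practice the text would come from the command's standard output, for example captured with `subprocess.run(..., capture_output=True, text=True)`.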
Now start the lucene-geo-gazetteer server
lucene-geo-gazetteer -server
Run the Tika server as described in https://cwiki.apache.org/confluence/display/TIKA/GeoTopicParser on port 8001. The port can be configured via config.txt.
Make sure you can extract locations from Tika Server
curl -T /path/to/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:8001/rmeta
You can obtain the [file here](https://raw.githubusercontent.com/chrismattmann/geotopicparser-utils/master/geotopics/polar.geot).
The output should look like this:
[
{
"Content-Type":"application/geotopic",
"Geographic_LATITUDE":"39.76",
"Geographic_LONGITUDE":"-98.5",
"Geographic_NAME":"United States",
"Optional_LATITUDE1":"27.33931",
"Optional_LONGITUDE1":"-108.60288",
"Optional_NAME1":"China",
"X-Parsed-By":[
"org.apache.tika.parser.DefaultParser",
"org.apache.tika.parser.geo.topic.GeoParser"
],
"X-TIKA:parse_time_millis":"1634",
"resourceName":"polar.geot"
}
]
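The metadata above carries one primary location (`Geographic_*`) and numbered alternatives (`Optional_*1`, and presumably `Optional_*2` and so on when more are found). As a rough sketch of how a script might collect them, assuming the key naming follows the sample response exactly, the hypothetical helper `extract_locations` below gathers every name with its coordinates:

```python
import json
import re


def extract_locations(rmeta_json):
    """Collect (name, latitude, longitude) tuples from a Tika /rmeta response.

    Assumes the key layout shown in the sample: Geographic_NAME /
    Geographic_LATITUDE / Geographic_LONGITUDE for the primary location,
    and Optional_NAME<n> / Optional_LATITUDE<n> / Optional_LONGITUDE<n>
    for alternatives.
    """
    found = []
    for meta in json.loads(rmeta_json):
        if "Geographic_NAME" in meta:
            found.append((
                meta["Geographic_NAME"],
                float(meta["Geographic_LATITUDE"]),
                float(meta["Geographic_LONGITUDE"]),
            ))
        # Find the numeric suffixes of any Optional_NAME<n> keys.
        indices = []
        for key in meta:
            match = re.fullmatch(r"Optional_NAME(\d+)", key)
            if match:
                indices.append(int(match.group(1)))
        for n in sorted(indices):
            found.append((
                meta[f"Optional_NAME{n}"],
                float(meta[f"Optional_LATITUDE{n}"]),
                float(meta[f"Optional_LONGITUDE{n}"]),
            ))
    return found


# Trimmed copy of the sample response shown above.
SAMPLE_RMETA = """[
  {
    "Content-Type": "application/geotopic",
    "Geographic_LATITUDE": "39.76",
    "Geographic_LONGITUDE": "-98.5",
    "Geographic_NAME": "United States",
    "Optional_LATITUDE1": "27.33931",
    "Optional_LONGITUDE1": "-108.60288",
    "Optional_NAME1": "China",
    "resourceName": "polar.geot"
  }
]"""

points = extract_locations(SAMPLE_RMETA)
for name, lat, lon in points:
    print(name, lat, lon)
```

Returning plain tuples keeps the sketch independent of any mapping library; the values could then be passed to whatever plotting code you use.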
Run Django server
python manage.py runserver
Open http://localhost:8000/ in your browser.
Note: Please refer to the wiki page of this GitHub repository, which serves as a guide on how to use GeoParser.
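Because the manual setup involves several services, it can help to confirm that each one is accepting connections before debugging the UI. The sketch below is an illustration only: ports 8000 (Django) and 8001 (Tika) come from the steps above, while 8983 is Solr's default port and is an assumption about this setup. The helper name `is_listening` is hypothetical.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


SERVICES = {
    "Django (GeoParser UI)": 8000,
    "Tika server": 8001,
    "Solr (assumed default port)": 8983,
}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If you changed the Tika port in config.txt, adjust the `SERVICES` entry to match.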