fhamborg / Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Apache License 2.0

move nominatim to preprocessing #4

Open fhamborg opened 6 years ago

fhamborg commented 6 years ago

We initially had the Nominatim queries directly in the environment extractor to save some requests. However, as shown in https://github.com/bkrrr/Giveme5W/blob/master/extractor/extractors/environment_extractor.py, we query every phrase that is tagged LOCATION anyway, so no requests are actually saved. For the sake of a clean architecture, we can therefore perform the Nominatim querying in preprocessing as well.

fhamborg commented 6 years ago

I think there is no speed gain if (as it is now) the Nominatim extraction sits within the phrase extractor, because by definition the phrase extractor retrieves geopositions (using Nominatim) for each NER phrase. So why not move the whole extraction into the preprocessing, run it always, and also have it cached?
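A rough sketch of how a cached lookup in preprocessing could look. Everything here is hypothetical and not taken from the repository: the class name `CachedLocationPreprocessor`, the `(phrase, label)` input format, and the injected `geocode` callable are assumptions for illustration. A stub stands in for the real Nominatim request so the example runs offline.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]
# Assumption: the geocoder is any callable mapping a query string to
# (lat, lon) or None. A real one would wrap a Nominatim request.
Geocoder = Callable[[str], Optional[Coordinates]]


class CachedLocationPreprocessor:
    """Hypothetical preprocessing step: geocode each LOCATION phrase once."""

    def __init__(self, geocode: Geocoder) -> None:
        self._geocode = geocode
        self._cache: Dict[str, Optional[Coordinates]] = {}
        self.requests_sent = 0  # counts calls that actually reach the geocoder

    def lookup(self, phrase: str) -> Optional[Coordinates]:
        # Normalise the key so "Berlin" and "berlin" share one cache entry.
        key = phrase.strip().lower()
        if key not in self._cache:
            self.requests_sent += 1
            self._cache[key] = self._geocode(phrase)
        return self._cache[key]

    def preprocess(
        self, tagged_phrases: Iterable[Tuple[str, str]]
    ) -> Dict[str, Optional[Coordinates]]:
        # Only phrases tagged LOCATION are geocoded; other labels are skipped.
        return {
            phrase: self.lookup(phrase)
            for phrase, label in tagged_phrases
            if label == "LOCATION"
        }


def fake_geocode(query: str) -> Optional[Coordinates]:
    """Stub geocoder with made-up table entries, used instead of a network call."""
    table = {"berlin": (52.52, 13.405), "paris": (48.8566, 2.3522)}
    return table.get(query.strip().lower())


preprocessor = CachedLocationPreprocessor(fake_geocode)
doc = [
    ("Berlin", "LOCATION"),
    ("Angela Merkel", "PERSON"),
    ("berlin", "LOCATION"),
    ("Paris", "LOCATION"),
]
result = preprocessor.preprocess(doc)
```

With this layout the extractors would only read the precomputed positions, and repeated mentions of the same place cost a single request. The in-memory dict is the simplest possible cache; persisting it between runs would be a separate decision.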

fhamborg commented 6 years ago

This should be integrated with preprocessor_core_nlp.py.