denevers / PyHarvest

Naive harvesting of RDF into a PostgreSQL database
GNU General Public License v3.0

Initial refactoring of baseline functionality to support revised Nodes #21

Open jvanulde opened 2 years ago

jvanulde commented 2 years ago

Switched to the requests module for accessing the Canadian node: its certificate chain is broken, and requests offers an option to skip certificate verification. Also worked around an issue where the XML from the US node is not valid, at least as far as RDFLib is concerned. @denevers, please run crawler.py and review the output. Next steps are to containerize the harvester and have it post the triples to our data store.
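For reviewers, here is a rough sketch of the per-node verification toggle described above. It is not the code in crawler.py: the `Node` class, the `request_options`, `fetch_rdf`, and `parse_rdf` helpers, and the placeholder URL are all hypothetical names chosen for illustration. Only `requests.get(..., verify=..., timeout=...)` and `rdflib.Graph.parse(data=..., format="xml")` are real library calls. The workaround for the US node's XML is not shown, since its details are not described in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one harvest endpoint."""
    name: str
    url: str                  # placeholder URL, not a real node endpoint
    verify_tls: bool = True   # set to False only for a node with a broken chain


def request_options(node: Node, timeout: float = 30.0) -> dict:
    """Build the keyword arguments passed to requests.get for this node."""
    return {"verify": node.verify_tls, "timeout": timeout}


def fetch_rdf(node: Node) -> str:
    """Download the node's RDF/XML document as text."""
    import requests  # third-party; imported lazily so the helpers above stay importable

    response = requests.get(node.url, **request_options(node))
    response.raise_for_status()
    return response.text


def parse_rdf(xml_text: str):
    """Parse RDF/XML text into an rdflib Graph; raises if the XML is not well formed."""
    from rdflib import Graph  # third-party; imported lazily

    graph = Graph()
    graph.parse(data=xml_text, format="xml")
    return graph


# Assumed example: a node whose certificate chain cannot be verified.
canadian_node = Node("canada", "https://example.invalid/rdf", verify_tls=False)
```

Keeping the flag on each node, rather than disabling verification globally, means nodes with a valid chain are still checked. Disabling verification also makes requests emit an `InsecureRequestWarning` on each call, which is a useful reminder that the setting is a stopgap until the certificate chain is fixed.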