This crawler did not handle the header row of the country code CSV files correctly: it treated the header as a data row and created a URL node and a Tag node from the header fields.
This refactor uses pandas to parse the CSV files, which consumes the header row, and also removes logging to stderr.
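As a rough illustration of the approach (not the actual crawler code), the sketch below shows how `pandas.read_csv` consumes the header row so that it never becomes a URL/Tag pair. The column names `url` and `category_description`, the sample data, and the helper `parse_url_categories` are assumptions made for this example.

```python
import io

import pandas as pd

# Hypothetical sample mimicking a country code CSV file; the column layout is an assumption.
SAMPLE_CSV = """url,category_code,category_description,date_added,source,notes
https://example.com/,NEWS,News Media,2020-01-01,example,
https://example.org/,HUMR,Human Rights Issues,2020-01-02,example,
"""


def parse_url_categories(csv_text):
    """Return (url, category_description) pairs, excluding the header row."""
    # read_csv uses the first line as column names by default (header=0),
    # so the header fields never show up as a data row.
    df = pd.read_csv(
        io.StringIO(csv_text),
        usecols=["url", "category_description"],
        keep_default_na=False,  # keep empty cells as '' instead of NaN
    )
    # Select columns explicitly to fix their order, then drop repeated pairs.
    df = df[["url", "category_description"]].drop_duplicates()
    return list(df.itertuples(index=False, name=None))


pairs = parse_url_categories(SAMPLE_CSV)
```

With this kind of parsing, a Tag node labelled `category_description` should no longer be produced, since that string only occurs in the header.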
How Has This Been Tested?
Rerun of the crawler.
Screenshots (if appropriate):
MATCH p = (:URL)-[:CATEGORIZED {reference_name: 'citizenlab.urldb'}]->(:Tag {label: 'category_description'})
RETURN p
Types of changes
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
[x] My code follows the code style of this project.
[ ] My change requires a change to the documentation.