edgi-govdata-archiving / archivers.space

🗄 Event data management app used at DataRescues
https://www.archivers.space/
GNU Affero General Public License v3.0

Duplicate fields in JSON file? #54

Open jschell42 opened 7 years ago

jschell42 commented 7 years ago

Looking at the new metadata JSON file, there now seem to be duplicate fields, specifically: "Individual source or seed URL" and "url", and "Name of resource" and "title".

Are the auto-populated fields now being mixed in with the manual fields we had before?


{
    "Date of capture": "Wed Mar 01 2017 10:06:53 GMT-0600 (CST)",
    "File formats contained in package": "asp",
    "Free text description of capture process": "python script",
    "Individual source or seed URL": "http://directory.psc.gov/employee.htm",
    "Institution facilitating the data capture creation and packaging": "",
    "Name of package creator": "jschell42",
    "Name of resource": "",
    "Type(s) of content in package": "Directory of all employees of the department of health and human services, including Name, Organization, Job Title, Duty station",
    "UUID": "04ACC072-8D18-4FF3-A82F-9B9FFB7A7689",
    "recommended_approach": "Can search by only 1 letter, but only returns up to 500 results at a time, NOT paginated. Needs recursive scraping.",
    "significance": "Directory of all employees of the department of health and human services, including Name, Organization, Job Title, Duty station",
    "title": "HHS Employee Directory",
    "url": "http://directory.psc.gov/employee.htm"
}

dcwalk commented 7 years ago

@kmcculloch and @b5, could we confirm whether these are in fact duplicates?
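
A quick way to check is to group the fields of a record by value and list any value that appears under more than one key. The sketch below is only an illustration in Python, not code from archivers.space: `find_shared_values` is a hypothetical helper, and `record` is a subset of the metadata posted above.

```python
from collections import defaultdict


def find_shared_values(record):
    """Return {value: [keys]} for every non-empty string value
    that appears under more than one key (hypothetical helper)."""
    keys_by_value = defaultdict(list)
    for key, value in record.items():
        # Empty fields are skipped so that blanks do not count as matches.
        if isinstance(value, str) and value.strip():
            keys_by_value[value.strip()].append(key)
    return {
        value: sorted(keys)
        for value, keys in keys_by_value.items()
        if len(keys) > 1
    }


# Subset of the record from this issue, copied verbatim.
record = {
    "Individual source or seed URL": "http://directory.psc.gov/employee.htm",
    "Name of resource": "",
    "Type(s) of content in package": "Directory of all employees of the department of health and human services, including Name, Organization, Job Title, Duty station",
    "significance": "Directory of all employees of the department of health and human services, including Name, Organization, Job Title, Duty station",
    "title": "HHS Employee Directory",
    "url": "http://directory.psc.gov/employee.htm",
}

for value, keys in find_shared_values(record).items():
    print(keys, "share the value:", value)
```

On this record the check pairs "Individual source or seed URL" with "url", and "Type(s) of content in package" with "significance". "Name of resource" is empty here, so it does not match "title" by value, even if the two fields are meant to hold the same thing.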