Closed (iboates closed 9 years ago)
Hi, if you have a JSON file with a structure like:
{
"candidates": [
{"fullName": "Candidate 1"},
{"fullName": "Candidate 2"}
]
}
you would want to pass in candidates in place of nodes:
python gen_outline.py --collection candidates F:\electoral_map\candidates_python\candidates0_to_250.json
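To illustrate why the collection name has to match a top-level key in the file, here is a minimal sketch. It assumes the script loads the JSON and indexes it by the collection name, with nodes as the default; that is a guess based on the KeyError: 'nodes' reported in this thread, and load_collection is a hypothetical helper, not a function from gen_outline.py.

```python
import json

SAMPLE = """
{
  "candidates": [
    {"fullName": "Candidate 1"},
    {"fullName": "Candidate 2"}
  ]
}
"""

def load_collection(text, collection="nodes"):
    """Hypothetical helper: return the list stored under the given top-level key."""
    data = json.loads(text)
    # Raises KeyError if the file has no top-level key with that name.
    return data[collection]

# Works: "candidates" is a top-level key in SAMPLE.
records = load_collection(SAMPLE, collection="candidates")

# Fails: the assumed default "nodes" does not exist in SAMPLE.
try:
    load_collection(SAMPLE)
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]
```

With this file, records holds the two candidate dicts and missing_key is "nodes".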
I am now getting the error "AttributeError: 'dict' object has no attribute 'iteritems'"
I'm thinking that maybe my json is too complicated to be parsed? These are the first three items:
{
"objects": [
{"first_name": "Pascale", "last_name": "D\u00e9ry", "election_name": "House of Commons", "name": "Pascale D\u00e9ry", "elected_office": "candidate", "url": "", "gender": "", "extra": {}, "related": {"boundary_url": "/boundaries/federal-electoral-districts-next-election/24025/", "election_url": "/elections/house-of-commons/"}, "source_url": "http://www.conservative.ca/?member=candidates", "offices": [], "party_name": "Conservative", "incumbent": null, "district_name": "Drummond", "email": "", "personal_url": "http://www.conservative.ca/team/member/?fname=Pascale&lname=D\u00e9ry&type=candidates", "photo_url": "http://www.conservative.ca/media/team/Pascale-Dery.jpg"},
{"first_name": "Christine", "last_name": "Poirier", "election_name": "House of Commons", "name": "Christine Poirier", "elected_office": "candidate", "url": "", "gender": "F", "extra": {"twitter": "https://twitter.com/iciChristine", "facebook": "https://www.facebook.com/iciChristine.ca"}, "related": {"boundary_url": "/boundaries/federal-electoral-districts-next-election/24039/", "election_url": "/elections/house-of-commons/"}, "source_url": "https://www.liberal.ca/candidates/", "offices": [], "party_name": "Liberal", "incumbent": null, "district_name": "", "email": "Christine@iciChristine.ca", "personal_url": "http://christinepoirier.liberal.ca/", "photo_url": "https://www.liberal.ca/files/2014/06/Christine-Poirier-cropped.png"},
{"first_name": "Andrew", "last_name": "Seagram", "election_name": "House of Commons", "name": "Andrew Seagram", "elected_office": "candidate", "url": "", "gender": "M", "extra": {"twitter": "https://twitter.com/AndrewSeagram", "facebook": "https://fb.com/ASeagramNDP"}, "related": {"boundary_url": "/boundaries/federal-electoral-districts-next-election/35032/", "election_url": "/elections/house-of-commons/"}, "source_url": "http://www.ndp.ca/candidates", "offices": [], "party_name": "NDP", "incumbent": null, "district_name": "", "email": "", "personal_url": "http://andrewseagram.ndp.ca", "photo_url": "http://xfer.ndp.ca/2015/-CandidateWebAssets/35032-DON.png"},
...
I used the command:
python gen_outline.py --collection objects F:\electoral_map\candidates_python\candidates0_to_250.json
Thank you very much for your help so far.
EDIT: I formatted one entry of the json for easy viewing, in case that helps:
{
"objects": [
{"first_name": "Pascale",
"last_name": "D\u00e9ry",
"election_name": "House of Commons",
"name": "Pascale D\u00e9ry",
"elected_office": "candidate",
"url": "",
"gender": "",
"extra":
{},
"related":
{"boundary_url": "/boundaries/federal-electoral-districts-next-election/24025/",
"election_url": "/elections/house-of-commons/"},
"source_url": "http://www.conservative.ca/?member=candidates",
"offices": [],
"party_name": "Conservative",
"incumbent": null,
"district_name": "Drummond",
"email": "",
"personal_url": "http://www.conservative.ca/team/member/?fname=Pascale&lname=D\u00e9ry&type=candidates",
"photo_url": "http://www.conservative.ca/media/team/Pascale-Dery.jpg"},
Oh, that looks like a Python 3 vs. 2 compatibility issue. Most of this stuff is written for Python 2.7 (a lot of people still have 2.x by default and I was trying to keep the additional requirements to a minimum). If you can run it with python2, that's probably the easiest solution.
Otherwise we'll need to create a Python 3 compatible branch. Happy to help a fellow Canadian with an interest in politics.
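For context, dict.iteritems() exists only on Python 2; Python 3 dicts provide items() instead, which is what triggers the AttributeError. A small sketch of one way to iterate on both versions follows. iter_items is a hypothetical helper and is not taken from the scripts discussed here.

```python
def iter_items(d):
    """Hypothetical helper: iterate over (key, value) pairs on Python 2 or 3."""
    if hasattr(d, "iteritems"):
        # Python 2: iteritems() avoids building an intermediate list.
        return d.iteritems()
    # Python 3: items() returns a view, so wrap it in an iterator.
    return iter(d.items())

row = {"first_name": "Pascale", "party_name": "Conservative"}
pairs = sorted(iter_items(row))
```

Simply calling d.items() also works on both versions, at the cost of building a list on Python 2.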
Thanks a lot, I ran it in Python 2.7 and unfortunately have hit yet another problem, this one regarding unicode characters. For instance, the very first object contains:
"last_name": "D\u00e9ry"
And it seems as though the json2csv script does not like this unicode character and spits this out:
Traceback (most recent call last):
File "json2csv.py", line 155, in <module>
loader.write_csv(filename=outfile, make_strings=args.strings)
File "json2csv.py", line 105, in write_csv
writer.writerows(out)
File "C:\Python27\lib\csv.py", line 158, in writerows
return self.writer.writerows(rows)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
I thought maybe it was an issue in how I was grabbing the data from my source, but the unicode characters themselves are clearly visible in the output so I think it is in your script. I'm going to see if I can look under the hood to find a way to encode it to utf-8 properly but it looks like a lot of it is quite over my head so I doubt I'll get far.
Thanks again for all your help & replies.
EDIT: I have found that if I change line 91 to
return unicode(item, encoding='utf-8')
I get a different error:
Traceback (most recent call last):
File "json2csv.py", line 155, in <module>
loader.write_csv(filename=outfile, make_strings=args.strings)
File "json2csv.py", line 99, in write_csv
out = self.make_strings()
File "json2csv.py", line 82, in make_strings
for k, val in row.items()})
File "json2csv.py", line 82, in <dictcomp>
for k, val in row.items()})
File "json2csv.py", line 91, in make_string
return unicode(item, encoding='utf-8')
TypeError: decoding Unicode is not supported
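Both errors come from the same fact: the strings returned by the JSON parser are already decoded text, so they need to be encoded on the way out, not decoded again. The sketch below reproduces the two failures in isolation. It is an illustration under that assumption, not a patch for json2csv.py, and it is written so that it behaves the same way on Python 2.7 and Python 3.

```python
import json

# json.loads returns text that is already decoded
# (unicode on Python 2, str on Python 3).
name = json.loads('"D\\u00e9ry"')
text_type = type(name)

# 1) Encoding to ASCII fails: this is the UnicodeEncodeError
#    from the first traceback.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# 2) Decoding text that is already decoded fails with TypeError,
#    as in the second traceback.
try:
    text_type(name, encoding="utf-8")
    decode_failed = False
except TypeError:
    decode_failed = True

# 3) Encoding to UTF-8 bytes is the step that works.
utf8_bytes = name.encode("utf-8")
```

So the change to line 91 goes in the wrong direction: the value has to be encoded as UTF-8 when it is written, which is the job of the CSV writer.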
The csv library in Python 2.7 assumes ASCII by default, but most of my data sources tend to be UTF-8. All you need to do is pip install unicodecsv (it's currently the only requirement in requirements.txt).
JSON always assumes UTF-8, which is why it works out of the box. I'll add a note about UTF-8 support to the README.
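For anyone reading this on Python 3, the standard csv module handles text directly, provided the output file is opened with an explicit encoding. The sketch below shows that Python 3 equivalent. It is not the code used by json2csv.py (which relies on unicodecsv under Python 2.7), and the file name and field list are made up for the example.

```python
import csv
import os
import tempfile

# One row with a non-ASCII character, like the first object in the data.
rows = [{"first_name": "Pascale", "last_name": "D\u00e9ry"}]

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "candidates.csv")

# Python 3: open in text mode with an explicit encoding; newline=""
# lets the csv module control line endings itself.
with open(path, "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["first_name", "last_name"])
    writer.writeheader()
    writer.writerows(rows)

# Read the raw bytes back to confirm the name was written as UTF-8.
with open(path, "rb") as f:
    raw = f.read()
```

The é ends up on disk as the two bytes 0xC3 0xA9, which is its UTF-8 encoding.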
It's working!
You are officially my favourite person I've met on the internet. Thank you SO much for A. this wonderful script and B. patiently troubleshooting all of this crap for me. I'm very new to github, is there any way I can give you some kind of github "+rep" for this? Because you earned it about 100 times over.
Hi, I am trying to get gen_outline to work. In the docs it says to use
I am using exactly this:
And I get KeyError: 'nodes', so I took it out and just tried
And then it says that it is missing the argument "json_file"
Am I just not entering it right? Please help, I am new to github.