Open cris-cola opened 10 years ago
Can you provide a short sample of the input from tweets.json? Thanks.
yes, this is it:
"tweets.json" ---> http://codeshare.io/N3flN
Inputting tweets.json, I receive:
GNIPREMOVE-Unidentified,2014-06-06T16:22:46.000Z,Unidentified meta message
If I instead input a second .json file, tweets2.json, containing:
"tweets2.json" -> http://codeshare.io/WfJ2i
etc...
I receive as output:
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
GNIPREMOVE-Unidentified,2014-06-06T16:25:45.000Z,Unidentified meta message
etc.
Gnacs was designed to consume a real-time stream, so it looks for a succession of dictionary objects rather than a list. If you take off the [ and ] around your list, I think you will have success. (Yes, that turns valid JSON into a sequence of valid JSON records that is not, as a whole, a valid JSON document.)
Thanks,
DrS
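The suggestion above (a succession of records instead of one enclosing list) can also be done with a small script rather than by hand. This is only a sketch and not part of Gnacs: the function name `array_to_records` and the file names in the usage note are made up for illustration, and it assumes Python 3 and that the whole file fits in memory.

```python
import json
import sys


def array_to_records(text):
    """Turn the text of a JSON array into one JSON object per line."""
    # strict=False tolerates raw line breaks inside string values,
    # which pasted tweet text sometimes contains.
    items = json.loads(text, strict=False)
    if isinstance(items, dict):
        # Already a single record; wrap it so the join below still works.
        items = [items]
    # json.dumps escapes embedded newlines, so each record stays on one line.
    return "\n".join(json.dumps(item) for item in items)


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(array_to_records(handle.read()))
```

Saved under a hypothetical name such as `array_to_records.py`, it could be run as `python array_to_records.py json/tweets.json > json/tweets_records.json`, and the resulting file passed to `gnacs.py -c` in place of the original.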
On Fri, Jun 6, 2014 at 8:11 AM, kaine987 notifications@github.com wrote:
yes, this is it:
omhp-stud-146-50-221-89:json cristianocolacillo$ cat tweets.json
[{"created_at":"Wed Jun 04 14:02:16 +0000 2014","id":474189394271010816,"id_str":"474189394271010816","text":"CNN Poll: Public upset over VA scandal; Obama remains at 43% 
http://t.co/iitvUnYBoA","source":"iOS","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":1113229394,"id_str":"1113229394","name":"Hiram","screen_name":"HiramSipesh","location":"","url":null,"description":"Jusy want the world to be a better place!","protected":false,"followers_count":184,"friends_count":633,"listed_count":0,"created_at":"Wed Jan 23 02:32:03 +0000 2013","favourites_count":247,"utc_offset":null,"time_zone":null,"geo_enabled":false,"verified":false,"statuses_count":2013,"lang":"en","contributors_enabled":false,"is_translator":false,"is_translation_enabled":false,"profile_background_color":"C0DEED","profile_background_image_url":" http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_image_url":"http://pbs.twimg.com/profile_images/449360306759008256/nE9IZpIb_normal.jpeg","profile_image_url_https":"https://pbs.twimg.com/profile_images/449360306759008256/nE9IZpIb_normal.jpeg","profile_banner_url":"https://pbs.twimg.com/profile_banners/1113229394/1360892942","profile_link_color":"0084B4","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"symbols":[],"urls":[{"url":"http://t.co/iitvUnYBoA","expanded_url":"http://politicalticker.blogs.cnn.com/2014/06/03/cnn-poll-public-upset-over-va-scandal-obama-remains-at-43/","display_url":"politicalticker.blogs.cnn.com/2014/06/03/cnn…","indices":[61,83]}],"user_mentions":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false
,"filter_level":"medium","lang":"en"},{"created_at":"Wed Jun 04 14:02:16 +0000 2014","id":474189395692908545,"id_str":"474189395692908545","text":"#Like
ACA #Obamacare! RT @JonathanTurley: Southwest Fined For Advertising $59
Seats That Did Not Exist http://t.co/FqwVHy2guE","source":"Hootsuite","truncated":false,"in_reply_to_status_id":474177225462267904,"in_reply_to_status_id_str":"474177225462267904","in_reply_to_user_id":94784682,"in_reply_to_user_id_str":"94784682","in_reply_to_screen_name":"JonathanTurley","user":{"id":24993788,"id_str":"24993788","name":"Dan Kleinman","screen_name":"SafeLibraries","location":"USA","url":" http://safelibraries.blogspot.com/","description":"SafeLibraries - Are Children Safe in Public Libraries? Tweet/RT all sides of ☛library crime ☛free speech ☛censorship ☛authors ☛child safety ☛bullying ☛teaching","protected":false,"followers_count":1111,"friends_count":1995,"listed_count":49,"created_at":"Wed Mar 18 00:47:18 +0000 2009","favourites_count":2016,"utc_offset":-14400,"time_zone":"Eastern Time (US & Canada)","geo_enabled":false,"verified":false,"statuses_count":44372,"lang":"en","contributors_enabled":false,"is_translator":false,"is_translation_enabled":false,"profile_background_color":"EDECE9","profile_background_image_url":" 
http://pbs.twimg.com/profile_background_images/267910806/ilovemylibrarian.jpg","profile_background_image_url_https":"https://pbs.twimg.com/profile_background_images/267910806/ilovemylibrarian.jpg","profile_background_tile":true,"profile_image_url":"http://pbs.twimg.com/profile_images/1447790466/4c198dc6-0d19-42ed-8fea-1c41a34367b8_normal.png","profile_image_url_https":"https://pbs.twimg.com/profile_images/1447790466/4c198dc6-0d19-42ed-8fea-1c41a34367b8_normal.png","profile_banner_url":"https://pbs.twimg.com/profile_banners/24993788/1348608084","profile_link_color":"088253","profile_sidebar_border_color":"D3D2CF","profile_sidebar_fill_color":"E3E2DE","profile_text_color":"634047","profile_use_background_image":true,"default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"Like","indices":[0,5]},{"text":"ACA","indices":[6,10]},{"text":"Obamacare","indices":[11,21]}],"symbols":[],"urls":[{"url":"http://t.co/FqwVHy2guE","expanded_url":"http://wp.me/p6sYP-kMn","display_url":"wp.me/p6sYP-kMn","indices":[104,126]}],"user_mentions":[{"screen_name":"JonathanTurley","name":"Jonathan Turley","id":94784682,"id_str":"94784682","indices":[26,41]}]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"medium","lang":"en","NewObject-25":{}}, ..... ]
— Reply to this email directly or view it on GitHub https://github.com/DrSkippy/Gnacs/issues/33#issuecomment-45340619.
Scott Hendrickson 303.219.0022 @drskippy27 (transparency is the antidote for cynicism)
Thank you, now it works.
Is it possible to print all the fields contained in the .json file as output in the .csv file? Which option does that? Thank you.
I have used the given examples, and it works with your data samples, but why does it only print the "id", "postedTime" and the status? Why not all the fields contained in the .json input?
Moreover, if I use the same command with my own .json file, it doesn't produce a .csv file; it returns the very same .json file and says 'skipping' at the end...
Not today, but the ability to do that easily is coming in a refactor in a week or so.
DrS
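Until that refactor lands, one stopgap is a stand-alone script outside Gnacs that flattens every field into its own CSV column. The sketch below is an assumption-heavy illustration, not Gnacs functionality: the names `flatten` and `records_to_csv` and the dotted column naming are invented here, it uses only the Python 3 standard library, and it expects one JSON record per line.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse; an empty dict contributes no column at all.
            flat.update(flatten(value, name + "."))
        elif isinstance(value, list):
            # Lists (hashtags, urls, ...) are kept as JSON text in one cell.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def records_to_csv(lines):
    """Convert an iterable of JSON-record lines into CSV text."""
    rows = [flatten(json.loads(line)) for line in lines if line.strip()]
    # Union of all keys, in first-seen order, so no field is dropped.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Calling `records_to_csv(open("json/tweets_records.json", encoding="utf-8"))` (a hypothetical file name) returns the CSV as a string; records that lack a field get an empty cell in that column.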
Hi, maybe I am not sure how to use your code. I want to get a simple CSV file out of the .json data I got from Twitter, but apparently I am doing something wrong.
I am using this command to create a csv file:
gnacs.py -c json/tweets.json > output.xls
but instead of the columns inside my .json file, what I get is:
GNIPREMOVE-Unidentified,2014-06-06T16:22:46.000Z,Unidentified meta message
... etcetera...
This is strange, considering that I tried the same command with your sample data/*.json files and it worked.
Why is that?
Thank you