cube2222 / octosql

OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
Mozilla Public License 2.0
4.74k stars 201 forks source link

--output csv scientific notation and <nil> #300

Closed chapmanjacobd closed 1 year ago

chapmanjacobd commented 1 year ago

I don't think it should be printing in scientific notation... ? I just want an int

unzstd --memory=2048MB --stdout RS_2008-02.zst | octosql -o csv "
                                      SELECT 
                                          url as path
                                          , author
                                          , author_flair_text
                                          , created_utc as time_created
                                          , edited as time_modified
                                          , over_18 as is_over_18
                                          , archived as is_archived
                                          , is_self
                                          , is_video
                                          , link_flair_text
                                          , num_comments
                                          , num_crossposts
                                          , score
                                          , title
                                          , subreddit as playlist_path
                                      FROM stdin.json
                                  " | head -4
http://dancetubetv.com/watch/dffea144f9ac5dc9f339/optimus-prime-evolution-of-dance-tribute,dancetubetv,<nil>,1.204329569e+09,false,false,true,false,false,<nil>,0,0,0,optimus prime evolution of dance tribute,reddit.com
http://www.my-political-blog.com/index.php?page=videos&amp;section=view&amp;vid_id=101171,shan1011,<nil>,1.204329573e+09,false,false,true,false,false,<nil>,0,0,0,UFC 82 Countdown - Heath Herring,reddit.com
http://findvideo.us/fishing,[deleted],<nil>,1.20432958e+09,false,false,true,false,false,<nil>,1,0,1,Fishing Home Videos - searchable site,reddit.com
http://www.belfasttelegraph.co.uk/news/world-news/article3477119.ece,[deleted],<nil>,1.204329592e+09,false,false,true,false,false,<nil>,0,0,1,Why the only English mayor in France is worried about losing his kingdom,reddit.com

Also not sure why are there also... is nil "null" in go? While there is not standard for CSV, the convention is no char output: 1,,3

There is probably a good CSV library in go which can ensure proper quoting, etc ??

cube2222 commented 1 year ago

The scientific notation bit is actually partly a fault of JSON as a standard. JSON - according to the spec - doesn't support integers, everything is a float, and for very big floats they're printed in scientific notation.

The nil bit I agree, thanks for the report!

cube2222 commented 1 year ago

Both of these have now been fixed in https://github.com/cube2222/octosql/commit/2e6b0cdb2fc53660ebebdc243bd397df348b0351 and released in 0.11.1.

  1. Null is now properly printed in CSV as an empty field.
  2. Even though everything is still a float in JSON, the CSV output will not use scientific notation anymore. Instead, it prints the float with as many precision positions as are necessary, which means it will be an int if there is no fractional value.

Thanks for the report!