SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

Printing to Terminal #509

Open reedjosh opened 5 years ago

reedjosh commented 5 years ago

Hey, I just came to Daru from Pandas.

I write a lot of scripts at work that report out in the terminal. With Pandas I simply call print(df) and possibly change a few options such as max_colwidth=50.

I can't seem to nicely print a Daru DataFrame to the terminal. inspect is the closest thing I can find, but then the columns are all the same width (as opposed to best fit) and the object type is printed too.

puts df just seems to print the type, and so does puts df.to_s.

Thanks for your guidance!

kojix2 commented 5 years ago

Hello. Welcome to Daru. I am a Daru user. Yes, it's hard to use Daru on the terminal.

  1. You should use Jupyter Notebook/Lab if possible. Does IRuby not work well? Since the iruby gem is old, it is better to install from GitHub's master using specific_install.
  2. The width of the columns shown on the terminal is specified here. All columns are the same width, not best fit. We need a contribution.

https://github.com/SciRuby/daru/blob/d17887e279c27e2b2ec7a252627c04b4b03c0a7a/lib/daru/formatters/table.rb#L16-L17

  3. Daru has no option settings. We need improvement. https://github.com/SciRuby/daru/issues/502

reedjosh commented 5 years ago

  1. It sounds like you are suggesting IRuby as a way to get notebook-like output on the terminal. Is that the case?

  2. I did find the Spacing and Threshold variables. Unfortunately, my values do vary widely in width. : )

I would be interested in contributing to this. I have briefly looked into the code for inspect. How would you envision I add a best-fit printing function? It seems pretty_print is a missing method atm?

  3. Ah, so no way to set max_colwidth and forget?
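
As a rough illustration of what best-fit printing with a max_colwidth-style cap could look like, here is a minimal sketch in plain Ruby. The `format_table` helper and its `max_colwidth:` keyword are hypothetical names, not part of Daru. It works on plain arrays of headers and rows, assumes every row has the same number of cells, and leaves out the step of pulling those arrays from a Daru::DataFrame.

```ruby
# Hypothetical helper (not part of Daru): renders a table whose columns are
# each sized to their widest cell, with long cells clipped to max_colwidth.
def format_table(headers, rows, max_colwidth: 50)
  # Convert a cell to a string and clip it if it exceeds max_colwidth.
  clip = lambda do |value|
    text = value.to_s
    text.length > max_colwidth ? "#{text[0, max_colwidth - 3]}..." : text
  end

  # Header row plus data rows, all as clipped strings.
  table = ([headers] + rows).map { |row| row.map(&clip) }

  # Best fit: each column is as wide as its widest clipped cell.
  widths = table.transpose.map { |column| column.map(&:length).max }

  # Right-align every cell within its column and join columns with two spaces.
  table.map do |row|
    row.each_with_index.map { |cell, i| cell.rjust(widths[i]) }.join('  ')
  end.join("\n")
end

# Same data as a small 4x2 frame: first column holds the index labels.
headers = ['', 'a', 'b']
rows = [
  [:one, 1, 1],
  [:two, 2, 2],
  [:three, 3, 3],
  [:four, 4, 4]
]
puts format_table(headers, rows)
```

Because the helper returns a plain string, nothing but the table itself reaches the terminal when it is passed to puts.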

kojix2 commented 5 years ago

@reedjosh

  1. No, column widths do not change with the IRuby console/qtconsole. Sorry for my lack of explanation. I'm not a web developer. If you are working on Rails, the following tools may be useful (as you might have found).

  2. I'm glad to hear that. Ping @Shekharrajak

  3. I just forget about column width on the terminal and use Jupyter Lab. But it is clear that Daru should have option settings.

Shekharrajak commented 5 years ago

Thanks for reporting the issue. You can see the examples in IRuby notebook here.

@reedjosh, do you want to see all the columns (or all the rows)?

I see that Pandas in the Python shell and Daru in the Ruby shell display a similar table structure.

[1] pry(main)> df = Daru::DataFrame.new([[1,2,3,4], [1,2,3,4]],order: [:a, :b], index: [:one, :two, :three, :four])
=> #<Daru::DataFrame(4x2)>
           a     b
   one     1     1
   two     2     2
 three     3     3
  four     4     4

>>> pd.DataFrame([[1,1], [2,2], [3,3], [4,4]], index=['one', 'two', 'three', 'four'], columns=['a','b'])
       a  b
one    1  1
two    2  2
three  3  3
four   4  4
kojix2 commented 5 years ago

Pandas

image

Daru

image

The data I usually see does not fit within the column widths. For example:

image

Shekharrajak commented 5 years ago

Thanks @kojix2 for pointing this out. Yes! The widths should be adjusted, and there should be an option for setting the max width of all the columns.

reedjosh commented 5 years ago

Yes, Pandas auto-adjusts width, and Pandas prints without first printing the object.

I use pandas to display data on the command line quite often.

For now, I've set things up to use tj/terminal-table for my final output. It works well enough.

Should I get around to it, I will work on this, but I won't be able to do so anytime soon. Thanks all for the consideration and the work you've done to build Daru. It's much nicer than manually manipulating data!! : )

Shekharrajak commented 5 years ago

Thanks @reedjosh for letting us know. The table structure will surely be improved in the near future.