ronisbr / TerminalPager.jl

Pure Julia implementation of the command less
MIT License

Add optional argument to load only first n lines #26

Closed tp2750 closed 1 year ago

tp2750 commented 1 year ago

As mentioned in #10, it can take a long time to load a large table, and in many cases I do not actually want to page through thousands of rows.

Would it be ok to add an optional second argument to only load the first n lines?

As an example:

julia> m1 = rand(1000, 1000);
julia> d1 = DataFrame(m1, :auto);

Then pager(d1, 100) should conceptually do the same as pager(first(d1, 100)).

For DataFrames, the overhead of writing pager(first(d1, 100)) rather than pager(d1, 100) is small. For matrices, however, first(m1, 100) returns the first 100 elements rather than the first 100 rows, so we would need to write pager(m1[1:100,:]):

julia> pager(first(d1,5))
5×100 DataFrame
 Row │ x1        x2        x3        x4        x5         x6         x7        x8        x9        x10       x11       x12
     │ Float64   Float64   Float64   Float64   Float64    Float64    Float64   Float64   Float64   Float64   Float64   Flo
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.242587  0.338178  0.175047  0.526432  0.0752374  0.0826068  0.50735   0.488078  0.431373  0.386281  0.50296   0.8
   2 │ 0.126071  0.481133  0.108992  0.8893    0.519607   0.997476   0.623974  0.779318  0.782236  0.822382  0.74265   0.4
   3 │ 0.918782  0.170586  0.706512  0.228009  0.770864   0.193122   0.119362  0.331636  0.252595  0.694127  0.269961  0.8
   4 │ 0.239073  0.339586  0.560449  0.956554  0.572953   0.376078   0.396257  0.412697  0.518571  0.762955  0.789057  0.5
   5 │ 0.564384  0.618212  0.169127  0.691747  0.627538   0.679062   0.336765  0.76703   0.881205  0.467188  0.164898  0.9

julia> pager(first(m1,5))
5-element Vector{Float64}:
 0.32228426040689684
 0.09436948434039771
 0.01952993269383163
 0.5322686848549102
 0.4830636090242214

julia> pager(m1[1:5,:])
5×100 Matrix{Float64}:
 0.322284   0.0706254  0.691225   0.334359   0.605738  0.0657363  0.790047  0.140566  0.736919  0.644921  0.111976  0.5730
 0.0943695  0.964375   0.455835   0.0302767  0.603699  0.820126   0.711371  0.616206  0.968865  0.37598   0.10508   0.7004
 0.0195299  0.0757658  0.74813    0.145286   0.889892  0.448485   0.977724  0.494413  0.679627  0.262876  0.680987  0.3365
 0.532269   0.738045   0.0826661  0.807034   0.215173  0.749381   0.200094  0.130695  0.367826  0.248304  0.317622  0.5663
 0.483064   0.890258   0.587399   0.515627   0.170361  0.68068    0.883226  0.943145  0.487144  0.292369  0.482398  0.1293
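The difference between the three calls above could be hidden behind a small helper. The following is only a sketch of the idea: `first_rows` is a hypothetical name, not part of TerminalPager.jl, and the matrix method assumes ordinary 1-based arrays.

```julia
# Hypothetical helper (not part of TerminalPager.jl): keep only the first n rows.

# Matrices: slice rows explicitly, because first(m, n) would return elements.
first_rows(x::AbstractMatrix, n::Integer) = x[1:min(n, size(x, 1)), :]

# Vectors: the first n entries.
first_rows(x::AbstractVector, n::Integer) = x[1:min(n, length(x))]

# Fallback for anything else that supports first(x, n), such as a DataFrame.
first_rows(x, n::Integer) = first(x, n)

m1 = rand(1000, 1000)
head = first_rows(m1, 100)   # a 100×1000 matrix

# With TerminalPager.jl loaded, only the reduced object would then be rendered:
# pager(first_rows(m1, 100))
```

Because the reduction happens before the object is handed to pager, only the selected rows are ever rendered, with the limitation that searching covers only that part of the table.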

We could also consider a third argument to limit the number of columns, but in practice I find that tables tend to be taller than they are wide.

Just to be clear: the use case is to give the user an easy way to say that the table is large and that it is ok to load only the first n lines, with the limitations that implies. For example, searching will only cover that first part of the table.

ronisbr commented 1 year ago

Hi @tp2750 !

Unfortunately, it will not change anything regarding the loading time. TerminalPager.jl is not data aware. Hence, it does not know if we are rendering a matrix, a vector, Markdown, etc. When you call pager, it immediately calls show to render the object to a string. The slow process is in this step.
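To illustrate the point, here is a minimal sketch of such a rendering step. It is not TerminalPager.jl's actual code: `render_to_string` is a hypothetical name, and the only assumption is that the object is rendered in full with show before any paging happens.

```julia
# Hypothetical illustration: render an object to text, as a pager has to do
# before it can display anything.
function render_to_string(x)
    io = IOBuffer()
    # :limit => false asks show to print the whole object, not an abbreviated view.
    show(IOContext(io, :limit => false), MIME"text/plain"(), x)
    return String(take!(io))
end

m = rand(200, 5)

# Every row is formatted here, even if only the first few lines are looked at later.
full = render_to_string(m)

# Slicing before rendering means only ten rows are ever formatted.
head = render_to_string(m[1:10, :])

println(count(==('\n'), full), " vs ", count(==('\n'), head), " line breaks")
```

In this sketch, keeping only the first n lines of `full` would not save any time, because the cost has already been paid inside show. Only reducing the object itself, as in the second call, makes the rendering cheaper.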

It would be very complicated to support different types of objects in TerminalPager.jl. What we need to do, and I am thinking about it, is create a new package called TablesPager.jl, or something similar, that uses the code here to render objects that follow the Tables.jl API. In that case, we could solve this slowdown for all Tables.jl tables.

tp2750 commented 1 year ago

Thank you for the reply @ronisbr

It sounds like a table-specific pager would be great. I'll close this for now then.

It is already a great package for tables :-)

Thanks a lot!

ronisbr commented 1 year ago

Perfect! I have some ideas, and a TablesPager would solve a lot of problems :)