burnash / gspread

Google Sheets Python API
https://docs.gspread.org
MIT License
7.11k stars 948 forks source link

cell.expand('table') feature requests #1414

Closed pabloazurduy closed 1 month ago

pabloazurduy commented 8 months ago

In the past, I used to use xlwings library for Excel. In their API they have a very handy way of selecting "tables" starting for a specific cell.

you could use it, for example, to get data from a series of sheets that have table with variable size but always start from the cell "A1". so you "expand" the selection similar to what you do when do cmd+left_arrow, cmd+down_arrow on the worksheet,

The usage is better explained in the documentation: range.expand('table')

this is a snippet of the usage, expand have one of three arguments "table", "right","down".

import xlwings as xw
wb = xw.Book()
sheet1 = wb.sheets[0]
sheet1.range('A1').value = [[None, 1], [2, 3]]
sheet1.range('A1').expand().address
sheet1.range('A1').expand('table').addres

I implemented the "table" version for myself just because I found it very handy.

    def expand_table(worksheet: gspread.worksheet.Worksheet, cell: gspread.Cell) -> str:
        """
        Expands a cell range in gspread based on non-null adjacent cells, similar to
        xlwings.Range('A1').expand('table').
        """

        if not cell.value:
            raise ValueError(f"Cell at {cell} is empty.")

        row_values = worksheet.row_values(cell.row)
        row_values = row_values[row_values.index(cell.value):]
        row_values = row_values[:row_values.index('')] if '' in row_values else row_values
        right_boundary_col = cell.col + len(row_values) -1

        column_values = worksheet.col_values(right_boundary_col)
        column_values = column_values[column_values.index(worksheet.cell(row=cell.row,col=right_boundary_col).value):]
        column_values = column_values[:column_values.index('')] if '' in column_values else column_values
        bottom_boundary_row = cell.row + len(column_values) -1

        up_a1 = gspread.utils.rowcol_to_a1(cell.row,cell.col)
        low_a1 = gspread.utils.rowcol_to_a1(bottom_boundary_row,right_boundary_col )

        return f'{up_a1}:{low_a1}' 

maybe there's a way of doing this using gspread, but I couldn't find it.

Thanks in advance!

lavigne958 commented 8 months ago

Hi thank you for this feature request. It's very interesting! I never thought of it !

Could you please confirm if I get your code correctly?

I understand your function:

  1. Takes a Cell object (row + column + value)
  2. It finds the furthest value on the right
  3. It find the furthest value at the bottom
  4. Return the coordinates of the furthest bottom right value

Right ?

If so, it could be improved yes, and possibly the API has ways to help, or at least we could play with the API and the coordinates system as it can in some scenarios return a matrix of all values only, so it's then easy to compute the furthest value has at least or direction (bottom or right) is already given by the API

lavigne958 commented 8 months ago

We can take a look at it, we have a few things to work on first then, if it's easy it could land in 6.1.0 if not then it's gonna be in the next version after that (at least 6.2.0)

pabloazurduy commented 8 months ago

hi @lavigne958 , I edited the post because it wasn't very clear, my apologies.

I added a snippet with a rudimentary implementation of the expand("table") case. however, it has some flaws:

  1. for checking nulls I do "" search. probably there's a more elegant way
  2. the rowcol_to_a1 transformation seems unnecessary, converting row and column numbers to text maybe its not necessary. Would it be better to return a CellRange ? (or an equivalent object ?)
  3. expanding "right" or "down" are just two specific cases of the same code, probably it's easy to add them

also, I'm not very familiar with the modules, but if you guide me on this I might be able to do a PR ? let me know :) thanks in advance

alifeee commented 8 months ago

Hi. I find images the most helpful.

Is this what you mean?

image

pabloazurduy commented 8 months ago

Hi. I find images the most helpful.

Is this what you mean?

[image]

yess, exactly that

alifeee commented 8 months ago

cool :)

it sounds like it could be a useful function.

Since it is unknown, it would involve getting the whole sheet, down and right, starting from cell.

Is there a way to get this in gspread currently?

If so, we could add a "table" function to utils, which could be used something like

all_cells = worksheet.get()
table = utils.find_table(all_cells)

I think that would be nice as this feature does not require extra requests, but only formatting of data we can get by existing means.

Are you interested in trying to make a PR? We can help if so :)

lavigne958 commented 8 months ago

cool :)

it sounds like it could be a useful function.

Agreed 🙃

Since it is unknown, it would involve getting the whole sheet, down and right, starting from cell.

Is there a way to get this in gspread currently?

I think, if you request an unbounded range from that cell, the API will return everything down from that point. Then all we need is to square it and remove everything after the first whitespace. 🤔

lavigne958 commented 1 month ago

alright ! we made it ! next release will contain the new feature.