Closed CalJaDav closed 1 year ago
Thanks for the edits
About the redraws, it should only redraw once if there's a chain of commands with no updates in between because it checks if there's an existing call to redraw, there may be a couple of functions where this isn't the case I'm not sure
def set_refresh_timer(self, redraw: bool = True) -> None:
if redraw and self.after_redraw_id is None:
self.after_redraw_id = self.after(self.after_redraw_time_ms, self.after_redraw)
About get and set, I agree, I wrote those functions with many uses before I learned that it's much better to keep things simple and straight forward but I am reluctant to change them due to backwards compatibility, I am not sure
About going through the front end, I don't know how much work that involves, if you think it's not too much then I guess sure, but maybe don't change the redraws for now until I get a better look at them
I am sorry to report that there's going to be a delay on releasing the new named ranges and event data version because I decided to add the ability to move and delete non-consecutive rows/columns so I got started on that, the benefit is that it should simplify the code in these areas, the drawback is it will take a little while and it will mean a small change in the event data for moved/added rows/columns (the added will change because of the need to undo delete columns and get event data for that) but in both cases it will involve a simple list of indexes
I am sorry to report that there's going to be a delay on releasing the new named ranges and event data version because I decided to add the ability to move and delete non-consecutive rows/columns so I got started on that, the benefit is that it should simplify the code in these areas, the drawback is it will take a little while and it will mean a small change in the event data for moved/added rows/columns (the added will change because of the need to undo delete columns and get event data for that) but in both cases it will involve a simple list of indexes
Not a problem, take your time. I had the distinct impression while writing my implementation of event data moved rows/cols that non-consecutive moves where going to be tricky. I'm all for moving slowly and carefully here. Named ranges will make some things easier going forward, but not some much easier that I need them right this instant.
About get and set, I agree, I wrote those functions with many uses before I learned that it's much better to keep things simple and straight forward but I am reluctant to change them due to backwards compatibility, I am not sure.
Yeah, this is always tricky but I think since we are moving towards a new version and dropping 3.6 there may be some precedent for losing a few rough edges if it makes the codebase more future-proof. We can, in the time being, flag the functions we might want to change with a deprecation warning (and post a warning for users of 3.6) on 6.xx. We can also patch in the new methods on 6.xx alongside their older counterparts, so that users can transition over before we break things. Given the amount of work left to do on 7.xx, this should give users plenty of warning about the upcoming changes. Ultimately, programmers are fundamentally lazy people and they won't start making changes until everything breaks, which is exactly what happened to my team when the new Pandas update dropped and deprecated a bunch of horrible spaghetti code which were puking out warnings that we ignored for months, but hey. You live and you learn, eh? (We don't, it will happen again).
About the redraws, it should only redraw once if there's a chain of commands with no updates in between because it checks if there's an existing call to redraw.
Huh, cool. I've stared at that function many times but hadn't put that together. There may be no need, you're a clever cookie.
I wish I could take credit for that redraw thing but someone else made the suggestion :P
Could I ask your opinion, with the named ranges, if you move a column into a named range should the range grow by one column or push the end column out of it
Also moving a column out of a named range, should it shrink the named range or pull a column from the end into it
It's a bit complicated because what if the named range borders the end, instead of pulling a column from the end into it, it would modify a column which was initially before it (but is now in the named range after the movement)
I don't think it would be contradictory to have the named range remain static even though it shrinks and grows with deleting/adding columns
But perhaps the simplest solution, if a column is moving out of a named range then it shrinks until it is deleted when the last column is moved out, it would of course be recreated to its prior dimensions upon an undo
It seems a bit weird though that a user could just keep moving columns to 2 indexes before the named range start and it eventually gets deleted (not worried about delete columns doing this).
Perhaps in the case of moving columns it's more logical for the named range coordinates to remain static? I guess this is my gut feeling on the correct behavior...
🤔
Thanks
Sorry lots of edits...
I think work needs to be done rather towards turning all possible get / set methods into properties.
A complementary builder like Interface can then be put up such that, if x
is the property then with_x
is its builder alternative. This pattern should obviously be used only where its needed
Could I ask your opinion, with the named ranges, if you move a column into a named range should the range grow by one column or push the end column out of it
Yeah, good questions. Named ranges were based around excel's feature of the same name so maybe we should just base the functionality on that, since it might be considered the expected behavior we are trying to replicate.
In excel, moving columns or into the range expands or shrinks the range. For example:
If we have the range A ... C and we swap A B C D -> A C D B the range becomes: A B
Equally, if we'd swapped A B C D -> A B D C the range becomes A B D C
Now, it does have some slightly odd behaviour, which we may or may not choose to replicate. If we move either end of the range A B outside the range: A B C D E F-> A C D E F B Then the range becomes: A C D E F B
However, moves WITHIN the range have no effect on its shape: If we have the range B C D and we swap A B C D E -> A C B D E the range remains: C B D
A complementary builder like Interface can then be put up such that, if x is the property then with_x is its builder alternative. This pattern should obviously be used only where its needed.
Copy that chief, I have been playing around with moving a lot of the interface functionality into properties with getter and setter arguments to wrap the subprocesses that need to be triggered when certain variables are changed, there are some things I am not yet sure how best to implement, maybe you can give me some thoughts.
Take setting sheet data as an example, I think we both agree that the best way to update an retrieve data would be through a Sheet.data
property; ie.
# getting data
output = sheet.data
# setting data
sheet.data = some_array
However, we need to work in some additional functionality, for instance whether to include headers. My inclination would be that it might look something like this:
# getting data with headers and index
output = sheet.include_headers().include_index().data
# setting data with an array that includes headers/index
sheet.include_headers().include_index().data = array_with_headers
Thoughts? @ragardner @demberto
@CalJaDav I believe if a setter needs more arguments / configuration than the value itself, it shouldn't be a property to begin with. However, what could be possible is that the configuration can be done later as well? (The include_headers
part?)
sheet.data = ...
sheet(.data?).include_headers = ...
I actually quite like the way xlwings handles this stuff:
# getting data in xlwings
sheet.range('A1').expand("table").options().value
# you can even do cool things like returning a dataframe...
sheet.range('A1').expand("table").options(pd.DataFrame, header=True).value
# and setting a value is handled like so:
sheet.range('A1').value = data
Perhaps this could be one way giving the sheet some additional functionality and simplify the huge number of one-off methods that all do very similar things. Specifically the implementation of a .range(cells, cols, rows, name)
method that returns a SheetRange
object that further operations can target, rather than seperate set_cells
, set_cols
, set_rows
methods.
.options
-> kwargs that can be passed along to setter/getter methods.
# Highlight rows
sheet.range(rows=[1,2,3]).bg = "#ffffff"
# Set a cells value
sheet.range((0,0)).value = 1
# insert a 2d array into the sheet, expanding rows/cols to account for any additional data that needs to be inserted.
# This is not currently possible in tksheet.
sheet.range((3,3)).options(expand="both").value = array
# Targeting the sheet rather than a range works as expected, and headers and index's can be handled as so
sheet.options(header = True, index=True).data = array
Again, I'm not sure if this is a more sensible pattern, it is probably what I would have done at one stage but I'd value your guy's thoughts.
Thanks for the ideas,
My thoughts are that this seems to be a more powerful way to set and get data
As you say it could also handle the cell options such as highlighting, formatting, dropdown boxes etc
Right now I would say generally if there were to be an overhaul this is the best way to go, I have looked at the api in pandas again but I've always found it a bit confusing, I reckon xlwings api, at least for stuff related to range
is much easier to understand
Of course, we could just change some of the functions into setter and getter functions and leave it at that but if we were to go with ‘range’
I am still quite unsure of the details of the implementation, some examples of which I haven’t included that many using ‘options’ sorry, and I also have no idea what’s actually possible as I don’t yet have a grasp of decorators
SheetRange object
Would .range(cells, cols, rows, name)
be best or perhaps
sheet.range(
# if BOTH `r` and `c` are NOT None we're referring
# to a range of cells where `r` and `c` are zipped
# to produce coordinates, it would skip instances
# where either `r` or `c` is not an `int`
# if `r` is NOT None and `c` is None then we're
# referring to a range of rows
r: Iterable | Generator | int | None = None,
# if `c` is NOT None and `r` is None then we're
# referring to a range of columns
c: Iterable | Generator | int | None = None,
name: str | None = None,
)
Example
# gets cell values for rows 0 up to not including 5, column 0
# returns e.g. ["r0 c0", "r1 c0", "r2 c0", "r3 c0", "r4 c0"]
sheet.range(r=range(5), c=0).value
Or maybe having to provide two iterables (or an int
and an iterable) for cells would be error prone?
Would name
override r
and c
in the case of getting data, as you're getting data from a named range e.g.
# returns the data from the named range "myrng" NOT rows 1, 2, 3
sheet.range(r=[1, 2, 3], name="myrng").value
Would it also work with putting table
, index
and header
in the range object?
sheet.range(r=1, c=[1, 2], index=True, header=True).value
sheet.range(r=[1, 2], table=False, index=True).highlight(**kwargs)
Could also put expand
into range
?
sheet.range(0, 0, expand="both").value = array
expand and data setting / inserting
For setting data using expand
would it push data like an insert or overwrite existing data like a paste
Or would this maybe be reserved for insert
e.g. I'm not sure what would happen here
# a bit messy or okay using so many range words
sheet.range(r=range(2), expand="down").value = [[1, 2], [3, 4]]
or
# insert columns 0: [1, 2], 1: [3, 4]
sheet.range(c=[0, 1], expand="right").insert([[1, 2], [3, 4]], widths=[120, 120], deselect=True, )
Named ranges
Creation
# a named range for rows 1, 2, 3 where all cells have a dropdown
sheet.range(r=[1, 2, 3], name="myrng").dropdown(**kwargs)
or
# a named range for rows 1, 2, 3 where all cells have a dropdown
sheet.range(r=[1, 2, 3], name="myrng").options(**kwargs).dropdown()
Adding options to a named range
The same as creation but without the r
arg, overwrites any existing of the same options
# refers to cells within named range “myrng”
sheet.range(name="myrng").dropdown(**kwargs)
or
# a named range for rows 1, 2, 3 where all cells have a dropdown
sheet.range(r=[1, 2, 3], name="myrng").options(**kwargs).dropdown()
Deleting a named range
Would this get confused with clearing values?
sheet.range(name="myrng").delete()
Deleting existing dropdowns from a named range
Unsure if this is correct syntax
sheet.range(name="myrng").dropdown.delete()
Cell options generally
Not sure which of the below would be best or if both would work
sheet.range(r=[1, 2], c=[1, 2]).highlight().delete()
sheet.range(r=[1, 2], c=[1, 2]).bg = "green"
sheet.range(r=range(10), table=False, index=True).align("global", redraw=True)
Deleting rows/columns
Not sure if this would even be a thing because it implies you could also delete cells which would pull the cells below the range up and the cells to the right of the range left
# delete columns 0 and 1
sheet.range(c=[0, 1]).delete(deselect_all=True, redraw=True)
A lot to consider ._.
My thoughts are that the .range()
is only a selection method. That is, you can use it to select and operate on certain parts of the table but it does only that. It can produce either a CellRange
, a RowRange
or a ColumnRange
, with a named range being a reference to any one of those, but you can only select ONE type of range at a time:
# Selecting a cell range
sheet.range(cells = [(0,0),(0,1),(1,0),(1,1)]) # -> CellRange
# Selecting a row range
sheet.range(rows = [1,2]) # -> RowRange
# Invalid! Cannot select more than one type of
sheet.range(cells = [(0,0),(0,1),(1,0),(1,1)], rows = [1,2]) # -> ValueError!
Named ranges will also come in three flavors (row, col, cell) and can be created something like:
sheet.range(cells = [(0,0),(0,1),(1,0),(1,1)]).name = "foo"
# and retrived like
sheet.range(name="foo") # -> CellRange
Ranges should also allow you to select areas between points, I'd probably suggest something like:
sheet.range(cells = "0,0:1,1") # = sheet.range(cells = [(0,0),(0,1),(1,0),(1,1)])
sheet.range(cols = "0:4") # = sheet.range(cols = [0,1,2,3,4])
# and maybe even using negative notation for selecting to the ends
sheet.range(cols = "0:-1") # -> All Cols
This will simply the coding a lot, since CellRanges
act on cell_options
, RowRanges
act on row_options
, etc. We already do this, pretty much all the operations on tksheet have these three flavors of operations anyway. It just will mean that we get to split out the interface into more manageable chunks.
With regards to expand, I think it probably should remain outside .range()
although I'm not too sure either way yet. Basically I see it as the solution to the reset_col_postions
arguments that currently sit in set_sheet_data
, basically telling it whether it can insert data or expand the table beyond the currently selected range like so:
Generally speaking, I don't think we should encourage expand
being used in cases where a selection would work just as well, i.e.
# We prefer this
sheet.range(cells="2,2:2,-1").clear()
# To this
sheet.range(cells=(2,2)).expand("right").clear()
There is obviously a lot to think about here, if we go ahead with this it does represent a pretty major change to your codebase, but it is one that could tidy up the library a lot.
I have been working on Column and Treeviews for quite some time on a project of mine. I was thinking if we could do something that separates the model from the view. This is a path offered by all popular UI frameworks. It won't be too much work, but its API needs to be designed carefully. Luckily we have a few existing implementations we can take ideas from. There's Qt's QStandardItemModel, Gtk has its own models. The model can raise virtual events as its data changes. If Tksheet wants to implement a multi column treeview in the future, an existing model can be reused. This mechanism can be orthogonal to the existing all-in-one implementation. I, unfortunately am not very knowledgeable about Tksheet's internals and don't know if its a worthwhile idea with the current implementation.
My thoughts are that the .range() is only a selection method. That is, you can use it to select and operate on certain parts of the table but it does only that. It can produce either a CellRange, a RowRange or a ColumnRange, with a named range being a reference to any one of those, but you can only select ONE type of range at a time
I like this approach, thank you!
I am hesitant to use the word range
in this way because of the builtin, would area
or span
be okay?
Ranges should also allow you to select areas between points
I could put a converter in for the optional of using strings
"A1:A3" # cells (0, 0), (1, 0), (2, 0)
"1:3" # rows 0, 1, 2
"A:C" # columns 0, 1, 2
I am nearly finished with the internals for named ranges (I think) so I can basically start work on the api soon, I will consider the specifics of syntax relating to the existing arguments like reset_row_positions
as you say
I have been working on Column and Treeviews for quite some time on a project of mine. I was thinking if we could do something that separates the model from the view. This is a path offered by all popular UI frameworks. It won't be too much work, but its API needs to be designed carefully. Luckily we have a few existing implementations we can take ideas from. There's Qt's QStandardItemModel, Gtk has its own models. The model can raise virtual events as its data changes. If Tksheet wants to implement a multi column treeview in the future, an existing model can be reused. This mechanism can be orthogonal to the existing all-in-one implementation. I, unfortunately am not very knowledgeable about Tksheet's internals and don't know if its a worthwhile idea with the current implementation.
I am pretty ignorant on such topics, but... I have already finished work on the new event data stuff which is emitted with a tkinter event when the sheet is modified by the end user, also a sheet redrawn event is emitted when it's redrawn (will push this code and docs to a branch soon)
I guess with getters and setters and a more simple way of creating and deleting the cell options stuff the internals will be more separate and everything easier to manage if more changes are made though
Once the api is more or less done I will go back to adding back in the relevant type hinting you started on also, I am slowly understanding it ._.
Shifting towards a greater seperation of roles is probably something to shoot for in the medium term. I think model/veiw pattern would probably look something like this for tksheet:
The view handles drawing all the canvas items and spawns events that inform the model about user changes. The model handles all the backend stuff related to moving columns around, editing data, determining the visible areas of the table, etc. Currently this functionality is all smooshed together inside the four main classes.
I am hesitant to use the word range in this way because of the builtin, would area or span be okay?
I think span is the best alternative.
I could put a converter in for the optional of using strings
Yeah, the string converter should probably support alpha-numeric excel-like cell coords and pythonic integer index coords.
I'm going to start work on a AbstractSpan
class tonight and implement it in a CellSpan
. For now, I will not be removing/modifying any existing features from the API. It will be developed alongside the existing code for now. I'll have a demo for you guys to play with in a few days and give feedback on.
Closing this PR for now, lets move this discussion back to PR #176 or the TODO.
Hi, I've update a few of the methods in
sheet.py
to be more builder-like. This essentially means that they can be chained together, rather than having to be listed called seperately, like the leetcode example belowThere are a few functions I haven't touched, but which I would be good to change.
Sheet.headers
,Sheet.row_index
have this kind of dual functionality where they'll either set or return the headers depending on whether you specifiy a set value. This is cool, but I think it would personally be better if this functionality were handled only by theget
and aset
functions at theSheet
level. Generally, I am of the opinion that functions should aim to do one thing and one thing only. Lets takeSheet.data_reference
for example. Currently this is just an accessor forMT.data_reference
and looks like this:So this function really does four separate things. If I provide it with
data=None
it will return me the sheet data, on the other hand if I provide it with data it will set sheet data. Furthermore, I also have the option of resetting row and column positions. Personally I think this is way too much and leads to hard-to-read and confusing code. Here is how I think this functionality should look from a user level:I also think that it is generally a good idea to keep
redraw = False
as this means that the table isn't updated until it is specified by the user, which is more performant in chained operations. I want to stress that these are changes that can be made without changing any of the background code (MT
,CH
orRI
).tksheet.Sheet
is essentially a facade for the more complex stuff in the backend, so we can have more complicated functions hiding back there but present the user with simple functions that only do a few things. In general, my recommendation is that we avoid putting confusing multi-functional, multi-output functions on the front end and front-end functions should fall into one of two categories; methods or configs that do or change something (and returnself
), andget
orcheck
functions that return other stuff.Let me know what you think. I am happy to go through and refactor the whole front-end if you are satisfied with these changes, currently I have not altered any of the functionality of the functions.