dave-howard / vsdx

vsdx - A python library for processing .vsdx files
BSD 3-Clause "New" or "Revised" License
67 stars 25 forks source link

Review API presented by VisioFile class in readiness for v1.0.0 release #22

Open dave-howard opened 3 years ago

dave-howard commented 3 years ago

The API has evolved a bit organically and as such is not currently very consistent.

Most methods today are not marked as private (not underscore) but many are not really aimed at end users of the package.

If we think about how each object is likely to be used we are likely to be able to a inituitive experience for user/developers.

For example - being consistent with how a user finds, moves, copies, removes a Shape or a Page

Ideally we should have a well defined API before we move to v1.0.0 release so we can avoiding breaking changes later.

stevensultana commented 3 years ago

Been a while since I checked in.

I'd add two items to this task:

Or whatever makes sense: maybe connectors are closer related to Shapes, for example.

I also have a bit of an annoyance, but it's quite a breaking change: Page.shapes returns a list of 1 item, but the spec says there should be zero or 1 shapes container in a Page, so the list is not intuitive. Further to that, I think the user's intuition would be that Page.shapes() returns a list of Shapes not a shapes container. I'd say, actual_list_of_shapes = page.shapes

is much friendlier and intuitive than actual_list_of_shapes = page.shapes[0].sub_shapes()

That's off the top of my head. I'll try to review the code this weekend and give my opinions on the questions you raised, just to provide a different perspective :)

dave-howard commented 3 years ago

that is very much in line with my thinking too - and now is the right time for those API breaking changes

dave-howard commented 3 years ago

one other quick thought on how we break down the classes/files - I would want the import statements to be simple and descriptive - none of these deeply nested imports you sometimes see. I am sure that's doable regardless of how we arrange the source files internally.

stevensultana commented 3 years ago

Mostly a train of thought: A VisioFile contains Pages (and MasterPages) A Page contains Shapes and Connectors Shapes are made up of Cells (properties) and subshapes Connectors in the vsdx file refer to the connector ID, and one Shape to which they connect. So a "full" connector is made up of 2 Connector elements.

The corollary of this is that:

The exception to this sweeping statement are the jinja functions, which I did not study properly and might very well have their home in VisioFile.

The object users will probably want to interact with the most is the Shape and then the Connector.

Functions related to changing shapes should be (exclusively) in the Shape class. So that would mean that the following functions should be removed from VisioFile:

For VisioFile.apply_text_context(), it's a rather specific use-case but it could remain, a bit differently. VisioFile.apply_text_context() would call Page.apply_text_context() for all the pages, which in turn calls Shape.apply_text_context() for each shape.

Maybe the increment_shape_id() functions could be moved to be at the Page level (pretty much all of them have a page parameter)

VisioFile.copy_shape() I think should be at the Shape class only (Shape.copy already simply calls VisioFile.copy_shape). VisioFile.insert_shape() I think can be moved to the Page class. page.insert_shape(new_shape) feels intuitive. Especially when we get to a point where the user can create shapes from scratch.

Possible breaking changes: Shape.text: eventually, we should get to a point where changing a shape's text (eg. with find_replace) preserves the text's formatting.

Setting a shape's properties should update all cells which use that property in their formula. This is especially important for width and height, but other properties might also be referenced in this way.

Adding connectors: I think adding connectors is an important feature that should be developed before a v1 release.

stevensultana commented 3 years ago

Regarding the project structure, I agree that it should be as flat as possible. Maybe something like this (inspired by https://docs.python-guide.org/writing/structure/):

top directory
  |
  |-- vsdx
  |    |
  |    |-- __init__.py
  |    |-- visiofile.py - includes VisioFile class and helper functions
  |    |-- shapes.py - includes the Shape and Cell classes
  |    |-- pages.py - includes the Page and Connector classes
  |    +-- (etc)
  |
  |-- tests
  |    |
  |    |-- visiofile_test.py
  |    |-- shape_test.py
  |    |-- page_test.py
  |    +-- (etc)
  |
  |-- docs
  |    |
  |    +-- (pretty much as is)
  |
  |-- setup.py
  +-- (requirements.txt etc)
stevensultana commented 2 years ago

Hi Dave, fwiw, I'm trying to understand what would be required if we split the project into multiple files. I have a branch at https://github.com/stevensultana/vsdx/tree/split_project_files to work on this, if you want to have a look yourself :)

Until now, I encountered some issues with circular dependencies, but managed to wrangle them a bit with ordering the imports. Still quite a brittle solution, though.

stevensultana commented 2 years ago

Hi Dave, do you have an API design in mind? I'll be happy to whip something up for your review :)

dave-howard commented 2 years ago

HI Steven - I looked at your proposed changes in pull request. The init.py file is the surface of the API that consumers of the package will be using; and I think this allows easy access to those classes they shoud be using like VisioFile, Page, Shape etc. and we can always add more if need be, and this doesn't stop the consumer from importing other nested items they might want to. With regards to circular imports and the conditional/late imports to mitigate - we can import vsdx and reference these as e.g. media = vsdx.Media(). That seems to work fine for our test suite - and should be fine for a package consumer also I think. I've added a Geometry section in latest release - so I will release that to PyPI and then will merge your pull request and make a start on integrating what you have done here with a few tweaks to mitigte the circular imports.

dave-howard commented 2 years ago

Didn't mean to close this. You should see I have merged pull request and resolved conflicts - all tests are passing but please have a look at flag any issues I might have introduced whilist combining.

dave-howard commented 2 years ago

@stevensultana ...after a long pause... my thinking on this is:

I will start on those changes - please add any more thoughts here

Thanks Dave

dave-howard commented 2 years ago

sub_shapes() would ideally be a property instead of a method - changing to property would be a breaking change so we can introduce a new property such as child _shapes to Page and Shape class, and deprecate sub_shapes() method. Note task added to list above...

dave-howard commented 2 years ago

Another item to be done prior to v1.0.0 - improve the documentation