jmw86069 / venndir

Venn diagrams with directionality (concordance), optional display of items inside the figure, text Venn diagrams
https://jmw86069.github.io/venndir

Migrate from sp/rgeos to polyclip #5

Closed: jmw86069 closed this issue 2 months ago

jmw86069 commented 9 months ago

The latest update to the sp and rgeos R packages did in fact remove rgeos from CRAN. Despite a year of advance warning, it was not easy to perform a drop-in replacement with the sf package, in part because until recently our Linux system was unable to install sf and its myriad dependencies. Something about requiring C++ newer than was available, then something about stdclib needing to be more recent, etc. etc.

The conclusion was to use polyclip instead of sf to fill the same roles: buffer outside/inside a given polygon; distance from point to polygon, nearest polygon, things like that. These functions are available in polyclip and do not require installation of extensive geographic mapping libraries (as with sp and now sf). Also, polyclip is available on older versions of Linux.
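For readers unfamiliar with polyclip, here is a minimal sketch of two of the roles mentioned above: buffering a polygon outward and inward, and testing whether points fall inside it. It assumes the polyclip package is installed; the square and the test points are made-up illustration data, not anything from venndir.

```r
# Sketch only: runs when the polyclip package is available.
if (requireNamespace("polyclip", quietly = TRUE)) {
  # Illustration data: a 4 x 4 square, vertices in counterclockwise order.
  square <- list(x = c(0, 4, 4, 0), y = c(0, 0, 4, 4))

  # Positive delta buffers outward, negative delta buffers inward.
  outer_buffer <- polyclip::polyoffset(square, delta = 1, jointype = "round")
  inner_buffer <- polyclip::polyoffset(square, delta = -1, jointype = "miter")

  # Point test: 1 = inside, 0 = outside, -1 = on the boundary.
  pts <- list(x = c(2, 5), y = c(2, 5))
  inside <- polyclip::pointinpolygon(pts, square)
}
```

Both calls take the plain `list(x, y)` polygon format, which is part of what makes the package light to adopt.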

The task is to migrate venndir functions to use polyclip, thus removing all dependency on sp and rgeos.

LPotter21 commented 7 months ago

Hi James, First of all, thank you so much for this package, I have used it extensively over the past several years. In particular, the directionality options have been so useful when comparing RNA-seq results across analyses.

Is this migration still an ongoing process? I uninstalled/reinstalled venndir and am still getting the GEOS errors. Just want to make sure it's not a "me" issue with regard to R or other package dependency versions. Thank you again, Luke

jmw86069 commented 7 months ago

Thank you very much for your comments!

The migration is still in progress, and I hope to have the new version available this week. The major functions have been transitioned. Apologies for the downtime. For what it's worth, I'm having trouble without venndir also!

jmw86069 commented 6 months ago

Progress: Most functions are ported to their polyclip equivalents. I ran into a few snags testing edge cases: the polyclip polygon format is sometimes a list with x,y and sometimes a list of lists with x,y, depending upon the presence of holes or multiple disjoint polygons.

Also plotting polygons has three basic options: graphics::polypath() for base R graphics, ggplot2::geom_polygon() for ggplot2 output, or grid::grid.path() for grid output. They have slight differences in data handling:
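As an illustration of those differences, here is a sketch of how the same square-with-a-hole might be laid out for each of the three options. It is based on the documented interfaces of these functions, not on the author's own list; the coordinates are made up, and the ggplot2 part only builds the data frame (the `subgroup` aesthetic of `geom_polygon()` would be mapped to it).

```r
# Illustration data: a square with a square hole, in 0..1 coordinates.
outer <- list(x = c(0.1, 0.9, 0.9, 0.1), y = c(0.1, 0.1, 0.9, 0.9))
hole  <- list(x = c(0.4, 0.4, 0.6, 0.6), y = c(0.4, 0.6, 0.6, 0.4))

# Base R graphics: one vector per axis, sub-paths separated by NA.
base_xy <- list(x = c(outer$x, NA, hole$x), y = c(outer$y, NA, hole$y))

# grid: concatenated coordinates plus an id vector naming each sub-path.
grid_xy <- list(x = c(outer$x, hole$x), y = c(outer$y, hole$y),
                id = rep(1:2, each = 4))

# ggplot2: long data frame; group marks the polygon, subgroup its parts,
# e.g. geom_polygon(aes(x, y, group = group, subgroup = subgroup)).
gg_df <- data.frame(x = grid_xy$x, y = grid_xy$y,
                    group = 1, subgroup = grid_xy$id)

grDevices::pdf(NULL)  # throwaway device so the sketch runs non-interactively
graphics::plot.new()
graphics::polypath(base_xy$x, base_xy$y, rule = "evenodd", col = "grey80")
grid::grid.newpage()
grid::grid.path(grid_xy$x, grid_xy$y, id = grid_xy$id, rule = "evenodd",
                gp = grid::gpar(fill = "grey80"))
invisible(grDevices::dev.off())
```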

I may change all output to grid, to simplify the plotting steps. (So I don't have to handle so many polygon data formats internally.)

grid output would natively use gridtext::richtext_grob(), it is compatible with patchwork for complicated layouts, and it offers some cleaner ways for calculating label widths in context of the output device.

I'm avoiding the temptation to write a "simple little geometry package" to handle things... just trying to make things work here first.

All that to say, hopefully later this week. I also really need this working again.

jmw86069 commented 5 months ago

Clearly this was a larger effort than anticipated. In development, I have functional venndir() which uses polyclip equivalent functions for many tasks.

Interestingly, I found it helpful to create a new S4 object JamPolygon (for internal use at this point) because polyclip likes to store polygons in two formats: list(x=..., y=...) for a simple polygon, or list( list(x=..., y=...), list(x=..., y=...) ) for complex polygons - including polygons with internal holes, and/or multi-part disjoint polygons.
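To make the two formats concrete, here is a small sketch of a normalizing helper that always returns the list-of-lists form. The name `as_polygon_list()` is hypothetical and is not a venndir or polyclip function; JamPolygon itself may handle this quite differently.

```r
# Hypothetical helper (not part of venndir or polyclip): coerce either
# format into a list of list(x, y) parts.
as_polygon_list <- function(p) {
  if (is.list(p) && all(c("x", "y") %in% names(p))) {
    # Simple polygon: wrap it so callers always see a list of parts.
    return(list(list(x = p$x, y = p$y)))
  }
  # Already a list of parts: keep only the x and y of each part.
  lapply(p, function(part) list(x = part$x, y = part$y))
}

# Illustration data: a square, and the same square with an inner ring.
simple <- list(x = c(0, 4, 4, 0), y = c(0, 0, 4, 4))
multi  <- list(simple, list(x = c(1, 1, 3, 3), y = c(1, 3, 3, 1)))

n_simple <- length(as_polygon_list(simple))
n_multi  <- length(as_polygon_list(multi))
```

With one shape downstream, functions no longer need to branch on whether a result happened to contain holes or multiple parts.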

I found I needed to implement methods to determine which multi-part polygons were "holes" or "solid" - even recognizing nested polygons (a polygon inside the hole of another polygon). Also implemented point_in_JamPolygon() which detects whether a point (or series of points) is inside the solid portion of a polygon.
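One common way to approach the solid/hole question is by nesting depth: a ring contained by an odd number of other rings is a hole, and a point is in the solid portion when an odd number of rings contain it (the even-odd rule). The sketch below uses hypothetical helper names and plain ray casting; it is not the venndir implementation, and it assumes rings do not cross each other.

```r
# Hypothetical helpers, not the venndir implementation.
point_in_ring <- function(px, py, ring) {
  n <- length(ring$x)
  j <- c(n, seq_len(n - 1))  # index of the previous vertex for each edge
  xi <- ring$x; yi <- ring$y
  xj <- ring$x[j]; yj <- ring$y[j]
  # Ray casting: count edges crossed by a horizontal ray going right.
  straddles <- (yi > py) != (yj > py)
  x_cross <- (xj - xi) * (py - yi) / (yj - yi) + xi
  sum(straddles & px < x_cross) %% 2 == 1
}

ring_is_hole <- function(rings) {
  vapply(seq_along(rings), function(i) {
    # Nesting depth: how many other rings contain this ring's first vertex.
    depth <- sum(vapply(rings[-i], function(other) {
      point_in_ring(rings[[i]]$x[1], rings[[i]]$y[1], other)
    }, logical(1)))
    depth %% 2 == 1
  }, logical(1))
}

point_in_solid <- function(px, py, rings) {
  # Even-odd rule: solid when an odd number of rings contain the point.
  hits <- vapply(rings, function(r) point_in_ring(px, py, r), logical(1))
  sum(hits) %% 2 == 1
}

# Illustration data: a solid square, a hole, and a solid nested in the hole.
sq <- function(lo, hi) list(x = c(lo, hi, hi, lo), y = c(lo, lo, hi, hi))
rings <- list(sq(0, 10), sq(2, 8), sq(4, 6))

holes <- ring_is_hole(rings)
in_band   <- point_in_solid(1, 1, rings)  # between outer ring and hole
in_hole   <- point_in_solid(3, 3, rings)  # inside the hole
in_nested <- point_in_solid(5, 5, rings)  # inside the nested solid
```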

I implemented label_fill_JamPolygon() to display items inside the solid portion(s) of a polygon; buffer_JamPolygon() to create internal/external buffer region around the solid portion(s) of a polygon. The label fill is currently only rectangular, though I plan to add other offsets to minimize label overlaps.

I implemented plot.JamPolygon() (also generic plot()) with some improvements. All output will use grid with no base R graphics, and no ggplot2 (currently). It should be much more usable - multiple figures could be assembled with patchwork or other grid friendly methods.

It also understands the solid/hole polygons and encodes that information for polyclip as clockwise/counterclockwise points as needed, or for vwline::grid.vwline() which renders a border on the inner/outer side of the polygon and needs to know the direction the line is drawn to know whether right/left is outer/inner, and solid/hole. It works. (Truth be told, I've been wanting a more robust method to draw a polygon with outer border and inner border. Adjacent polygon borders can be seen side-by-side by using innerborder and they will not obscure each other.)
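The clockwise/counterclockwise encoding can be sketched with the shoelace formula: the sign of a ring's signed area gives its winding direction, and reversing the vertex order flips it. The helper names are hypothetical, and the convention in the usage lines (solid counterclockwise, hole clockwise) is an assumption for illustration; the thread does not say which direction venndir assigns to which.

```r
# Hypothetical helpers, not the venndir implementation.
signed_area <- function(ring) {
  n <- length(ring$x)
  nxt <- c(seq_len(n)[-1], 1)  # index of the next vertex, wrapping around
  # Shoelace formula: positive for counterclockwise, negative for clockwise.
  sum(ring$x * ring$y[nxt] - ring$x[nxt] * ring$y) / 2
}

orient_ring <- function(ring, clockwise) {
  is_clockwise <- signed_area(ring) < 0
  if (is_clockwise != clockwise) {
    # Reverse the vertex order to flip the winding direction.
    ring$x <- rev(ring$x)
    ring$y <- rev(ring$y)
  }
  ring
}

# Illustration data: a counterclockwise square.
square <- list(x = c(0, 4, 4, 0), y = c(0, 0, 4, 4))

# Assumed convention for this sketch: solid = counterclockwise, hole = clockwise.
as_solid <- orient_ring(square, clockwise = FALSE)
as_hole  <- orient_ring(square, clockwise = TRUE)
```

Knowing the direction is also what lets a renderer decide which side of the line is the inner border and which is the outer border.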

Still todo:

jmw86069 commented 4 months ago

I have a version I am testing offline which uses polyclip and introduces two object types: JamPolygon, Venndir.

The minimal "final steps" to release is to clean up the Venndir object, and provide accessors/examples of accessing the data. When drawing a figure, accessing its data is not critical, but for me a common next step is to retrieve the items in each Venn overlap for follow-up use.

Unfortunately I did not create a GitHub branch prior to making changes (rookie mistake). Maybe I should move changes into a branch so I can commit and push updates, and allow testing from that branch, before merging to the main branch?

jmw86069 commented 2 months ago

I believe the core goals are accomplished.

All references to sp, sf, and rgeos are removed, along with all related functions. New functions were created only as needed for JamPolygon.

Internal functions now accept and return Venndir objects instead of the various types of list components. There are some references to polygon_list which is a sort of partial implementation of what polyclip accepts - however there should be no user-facing polygon_list use cases.

The three unchecked boxes above (optional rescale_JamPolygon(); enable shadowText; option to return grid::gTree) are not core to this issue, and will be treated as new features for future prioritization.