tdaverse / ggtda

ggplot2 extension to visualize persistent homology
https://tdaverse.github.io/ggtda/
GNU General Public License v3.0
20 stars 5 forks source link

first CRAN submission #55

Open corybrunson opened 5 months ago

corybrunson commented 5 months ago

We are close to our first submission to CRAN. I'd like to check with everyone on a few points first:

  1. Does the package look complete, with respect to tests, examples, and vignettes?
  2. Are there any open issues that you think should be resolved before submission?
  3. Do you have any other concerns that should be discussed first?

Thanks! I'll tentatively plan to submit at the end of April, but we can take what time we need to address anything that comes up.

jamesotto852 commented 5 months ago

I feel positive about your questions 1 and 2 and think the package could be submitted now; it covers many use-cases and works well. But, I see one potential change/improvement that would be pretty breaking if made later on:

We have (right now), three ways of visualizing persistence. Generally, these are implemented to take in start/end data via the Stat implementations and either pass to a {ggplot2} Geom or paired {ggtda} Geom. An exception is the dataset aesthetic in StatPersistence. I think there's the potential for a relatively painless refactoring here, we could implement a single StatPersistence that always takes the dataset aesthetic and Geoms for each type of visualization which always take the start + end aesthetics. This would have many advantages: it would be a standard API for users across persistence plotting functions which aligns more closely with {ggplot2} opinions, it would reduce duplication in the codebase, it would make implementing other styles of persistence visualization much easier (#54). The main disadvantage I see is it somewhat pushes the use of the dataset aesthetic but I don't really see a way around that -- I think we'd need to ramp-up the dataset aesthetic documentation slightly (I'd be happy to author a vignette on it). Of course, users would always have the option of pre-processing and passing into the corresponding Geom's similar to the current workflow with the Stat's (note, pre-processing and using a Geom is a more traditional {ggplot2} workflow).

I'd be glad to start work on this over the next week or so and submit as a PR, I actually did a really similar refactoring of {ggdensity} at the same stage in development (interestingly, the {ggdensity} refactor was due to comments from an R Journal reviewer) and feel I could do this pretty quickly. The current code is really good, I'd mainly be moving things around the different methods of the Geoms and Stats. I think we could even keep all of the current unit tests with just a few small changes/additions.


A few smaller notes:

corybrunson commented 5 months ago

Thanks a lot @jamesotto852. I apologize for waiting so long to reply. If you still have the bandwidth, i think the best course for your major comment would be to implement it in a branch as you suggest. I'm not clear on the architecture you have in mind, and i think trying to talk through it might be less useful than seeing even a rough version of it.

I'll take a look at the vignette and make the edit if i agree (or try to explain it if not). Since your second minor comment ties in to your major one, let's see what happens in the fork before addressing it.

Given the delay on my part, do you have an idea of when a draft refactor could be demoed and talked through? I won't be available until after next week, and after that it'll be touch and go due to travel.

jamesotto852 commented 5 months ago

No worries! I'm happy to implement this in a branch, I think I could have a functional demo ready pretty quickly. If I'm understanding your schedule correctly, it sounds like you'd be available to meet sometime the week of the 13th? If so, I can aim to have it ready then to walk you through it. At that point, my proposed architecture should be easier to communicate and we could make a decision about if we want to take the package in that direction.

corybrunson commented 5 months ago

Let's plan to meet during May 17–22. I'll be at a conference but unusually stationary and able to step away with a laptop.

jamesotto852 commented 5 months ago

Sounds good! I've been working on the implementation and feel sure I'll have it ready to go through by then. My schedule is pretty flexible that week (except for the 23rd), when we get closer let me know when would be best for you.

corybrunson commented 5 months ago

OK, i'll set a reminder to myself to suggest some good time slots the week before.