insightsengineering / nestdevs-tasks

0 stars 0 forks source link

Refactor the data class in teal #41

Closed donyunardi closed 10 months ago

donyunardi commented 1 year ago

Summary

We're improving the teal's data class (tdata, TealData, etc) and how it's being used in teal framework. Read the milestone description here

1. Introduction of teal_data

Target: done by first week of sprint 1 (1 week)

2. Task to do after the foundation code is merged:

Target: done by end of sprint 1

CHECKPOINT

sprint 1

3. Introduction of ddl

Target: done by the end of sprint 2

4. Introducing teal_data to the modules

Target: start this part by in sprint 3

Replacing data argument from tdata to teal_data will introduce breaking change. Problems in teal.transform can be mitigated by making "patches", this will give us possibility to release teal even if teal.transform isn't a final version.

Q: Instruction on migration, should we write vignette (static) or other platform, i.e. GitHub Discussion? A: We will use GitHub Discussion https://github.com/insightsengineering/teal/discussions/945

Q: We have functions (or PR) to support upgrade and downgrade of teal_data. Does this mean we need to keep old classes (i.e. TealData)? How long do we want to support these functions? A: We're not supporting the old ways. As discussed and agreed upon, the next release will be breaking change.

5. Release teal packages to CRAN

Target: start releasing in the beginning of Jan 2024

Breaking changes should not occur once sent on CRAN.

UPDATE: Release of teal frameworks (not module), will be done here: https://github.com/insightsengineering/nestdevs-tasks/issues/42

Related issues for later

Addressing step 1-4 automatically closes:

Definition of Done

donyunardi commented 1 year ago

@insightsengineering/nest-core-dev

Release teal.data internally once we introduce teal_data: https://github.com/insightsengineering/teal.data/pull/171 We need to do this before CRAN release.

We need to rethink this.

teal.data is a blocker for other teal framework packages so teal.data has to go to CRAN first. If we release to CRAN after the refactor, that would take too long.

I thought of couple options:

Option 1

  1. Provide soft-deprecate message on related functions.
  2. Release to CRAN and internal
  3. NOT merge https://github.com/insightsengineering/teal.data/pull/171 to main but instead to refactor branch.

Option 2

  1. Merge https://github.com/insightsengineering/teal.data/pull/171 to main.
  2. Provide soft-deprecate message on related functions.
  3. Release to CRAN and internal

Option 3 (what's currently suggested)

  1. Merge https://github.com/insightsengineering/teal.data/pull/171 to main.
  2. Provide soft-deprecate message on related functions.
  3. Release to internal only
  4. Finish the refactor branch
  5. Remove all soft-deprecated functions
  6. Release to CRAN and internal

I think option 3 is quite risky and could affect our CRAN timeline. I think option 1 would be ideal.

Please let me know that you think.

m7pr commented 1 year ago

I think it would be really awkward to upload teal.data to CRAN as a first official release and have deprecation messages already. If we do not think 3 is feasible in the timeline we hoped it go get done, then I would suggest

  1. uploading current main to CRAN without the soft-deprecation messages,
  2. uploading another version to CRAN with soft-deprecation messages once the refactor is merged,
  3. uploading another release after another 2 months with a deletion of deprecated functions, so that users have a time to realize some part of functionality is being removed
kartikeyakirar commented 1 year ago

I agree with Marcin on the initial deprecation message release, also believe in providing users ample time to adapt to the changes for a smoother transition.Option 1 with approach is to first upload the current main version to CRAN without the soft-deprecation messages, and then follow up with another version containing the deprecated messages, which I believe would be an improved approach.

vedhav commented 1 year ago

Now for some unpopular opinion. The two problems that I see are:

  1. Not being able to release old + new compatible release on GitHub followed by the new only release on CRAN because creating deprecation will take time.
  2. Releasing CRAN with deprecated functions/methods

What if? We can take a release branch and just merge the new changes without compatibility and release on CRAN and come back to do the old + new compatibility release on the main branch on GitHub followed by quickly updating the GitHub to the CRAN version. Not sure if this is feasible, but if it's possible my vote will be for this. Otherwise, I'm good with Marcin's proposal.

P.S While typing this I see how insane this scenario is

m7pr commented 1 year ago

I don't understand this proposition

vedhav commented 1 year ago

Nevermind, I was just thinking out loud. We scraped my proposition in the daily :)

donyunardi commented 1 year ago

@m7pr makes the most sense to me. Let's discuss and make decision next week when the whole gang is in.

lcd2yyz commented 1 year ago

Although I perfectly understand the struggle of "don't want to show my code to the world unless it's the best version", the "cost" seems too high to justify Option 3.

@m7pr's proposal looks optimal to me. This can also help to largely de-risk our upcoming PI, or any foreseen/unforeseen impact from the major refactor. It will allow us to keep the target of releasing the whole teal framework by end of year.

m7pr commented 1 year ago

We still have this week + next 2 weeks before the next increment starts. Maybe we can merge refactor within 3 weeks, and then we can release the final version that is ready and polished? I also think the refactor was non-committed goal/milestone for this increment so I dont think we should struggle to release this week if we have major conerns

chlebowa commented 1 year ago

My opinion is probably deeply ignorant but I don't think adjusting refactor pace to release schedule is the right way to go. I think it's better to put some effort in the refactor (which I firmly believe we can finish by the end of the year) than release prematurely. Having deprecation warnings in the first (CRAN) release may be a tad confusing and may undermine user confidence in the product as a whole.

m7pr commented 1 year ago

If there refactor finishes at the end of the year, then I think it's fine to release current main (without deprecation messages) so that we can unblock the process of releasing other packages in the line.

gogonzo commented 1 year ago

I suggest to go with initial plan:

  1. Release on GitHub with soft deprecation
  2. Immediately after GH release we remove deprecated stuff on main and release on CRAN.

Thanks to above teal.data won't be postponed.

  • uploading current main to CRAN without the soft-deprecation messages,
  • uploading another version to CRAN with soft-deprecation messages once the refactor is merged,
  • uploading another release after another 2 months with a deletion of deprecated functions, so that users have a time to
  • realize some part of functionality is being removed

Releasing long awaited package to CRAN and making API changes soon after is not a good idea. We will have to support users using old version and new version

m7pr commented 1 year ago

Thanks @gogonzo for the message. I just dont understand what you want to

  1. Release on GitHub? The current main? Or what? Or the refactor once it's merged?
  2. By deprecated stuff you mean messages, or the whole functionality?

Releasing long awaited package to CRAN and making API changes soon after is not a good idea. We will have to support users using old version and new version

Totally, but we will support this only by some small period of time.

gogonzo commented 1 year ago

Totally, but we will support this only by some small period of time.

I bet someone from NEST will announce "teal.data is on CRAN" and we will have a problem.

m7pr commented 1 year ago

got it. agree that we should release the final product to prevent supporting both versions and prevent people becoming discouraged becuase they need to learn new version, once they invested the time and the energy to learning the first version which is not finished

donyunardi commented 1 year ago

We re-discussed our strategy again and updated the original comment with the new agreed release plan and timeline. The main changes is to not perform any internal release and force breaking change for modules.

donyunardi commented 10 months ago

Refactor is completed. Closing.