dj-stripe / dj-stripe

dj-stripe automatically syncs your Stripe Data to your local database as pre-implemented Django Models allowing you to use the Django ORM, in your code, to work with the data making it easier and faster.
https://dj-stripe.dev
MIT License
1.56k stars 476 forks source link

[Call for contributors] We need a migration path for old-style to new-style JSONFields #1820

Closed jleclanche closed 9 months ago

jleclanche commented 1 year ago

Hi folks

dj-stripe 2.8.0 will drop support for old-style JSONFields - that is, jsonfields that did not use postgres special JSONField or django's native JSONField. See here for the deprecation notice.

Blocking the 2.8.0 release will be a migration path for those poor souls stuck on the old JSONFields. These are text fields, and need to be converted to the native json format.

Ideally we would do this in the migrations themselves but I do not believe it to be possible (without breaking things for people who were on new-style jsonfields). A one-off management command is probably preferable.

Contributions highly welcome as I suspect this will otherwise be a long term blocker for that future release.

--

Details from #1566:

Follow up from discussion on #517.

A portion of the installed dj-stripe userbase is using legacy-style text jsonfield columns. We want to migrate those users over to the "new" Django JSONField columns.

Thing to keep in mind:

  1. Some users are in that scenario, and some others are already using new-style JSONField columns.
  2. It is a bit opaque to users which scenario they are in, and we may want to offer them an easy way to check, or warn them if they try to run the migration unnecessarily.
  3. The migration is going to be very expensive to run, so we should probably run it from a management command via custom SQL, instead of as an actual django migration.
  4. We should not allow the migration to go wrong half-way through. If we run it ourselves, we want to do so in a transaction.
  5. Different database engines will process such a migration differently. We need to identify the required SQL in each engine we support (which is only mysql/mariadb, postgres and sqlite). I'm not sure sqlite needs a migration at all.
  6. There is a large amount of JSONField columns affected. We should probably enumerate them via model introspection.
czue commented 1 year ago

@jleclanche I'm game to pitch in on this or be a guinea pig if you are looking for help. Would love to have a path for this migration.

patroqueeet commented 11 months ago

Does have anyone done this eventually manually and can share some code?

kavdev commented 9 months ago

Does have anyone done this eventually manually and can share some code?

@patroqueeet Yep. You need to write a manual migration that runs sql: https://github.com/dj-stripe/dj-stripe/issues/517#issuecomment-325091306. Note that those fields were from an older version. Your mileage may vary (i.e. you might need to add/remove some).

jleclanche commented 9 months ago

Unfortunately, this is not happening. We have decided to break BWC for 3.0, so I'm closing this.

czue commented 9 months ago

@jleclanche That's a bummer. Is it that it is just too technically difficult to write the migration script? Perhaps something that could at least be documented with some appropriate caveats? E.g. it would be useful to know from this comment what the "some scenarios" are in "it should also only be run in some scenarios".

Also just a clearer sequence of steps would be helpful. Something like (not sure if this is right):

  1. Install dj-stripe 2.7.x
  2. Ensure DJSTRIPE_USE_NATIVE_JSONFIELD is set to False
  3. Run this migration. Note it might be slow, etc.
  4. Set DJSTRIPE_USE_NATIVE_JSONFIELD to True
  5. Confirm you didn't break anything.
  6. Upgrade dj-stripe to latest.
jleclanche commented 9 months ago

@czue It's not just this issue. It's a lot more, and especially no longer supporting the old-style integer IDs (which is reflected in all the foreign keys), as well as massive model changes planned next version so that dj-stripe stops breaking BWC every single release.

There's also this: dj-stripe tables are essentially cache tables for Stripe's upstream data (with maybe the exception of the djstripe_event table which retains data with no ttl, unlike Stripe). So wiping & re-downloading the data is safe. And in fact, we found this fixes some issues with bad data; mysterious crashes filed elsewhere on this tracker.

I'm trying to be pragmatic… concretely, we haven't had well-tested contributions that fix this issue in a way compatible with mariadb, which makes it difficult for us to support this migration path at all.

czue commented 9 months ago

@jleclanche thanks, and I appreciate the balance you have to maintain in terms of being able to keep the library working well and efficiently moving forwards versus supporting legacy users. I'm certainly not trying to change your mind, just trying to see what a good path forward is for people in my situation of having an app on a legacy version.

It sounds like you're saying the recommended approach is basically:

  1. Nuke all your dj-stripe data - preserving whatever information you need to ensure you still have references to the stripe models you need in your own foreign keys.
  2. Upgrade dj-stripe
  3. Sync your data back
  4. Restore the foreign keys, etc. to the new djstripe models

Does that sound right? When I get to upgrading my app, I can attempt to run through it and perhaps provide some patterns for doing this.

dustinblanchard commented 8 months ago

@jleclanche do y'all have an estimated date for the release of the next major version? This will still impact us if we are already on 2.8+, correct?

We are working on some additional features that will have foreign key relationships to dj-stripe models, so I'm trying to decide if we need to wait for the release.

jleclanche commented 8 months ago

Hey @dustinblanchard!

If your foreign keys are using non-integer IDs but indeed the string IDs (which are the same as stripe IDs), there will be no database-level migration necessary. Code changes and some migration fakery may be required, but nothing that will impact your data, as the data in the columns will be the same.

The constraint themselves could possibly change, as the table names may change, but this is something where we will try to retain the same table names so as not to break anything unnecessarily. Still even that would be a very easy change, if it were to happen.

As for an ETA, I don't have one, and it may be months away still depending on how things evolve.

Feel free to reach out to jerome@goveni.eu if you need some consulting time on this. I'll be offering up free help to anyone affected by our breaking changes in that release, as I realize it's a significant issue for our users.

dustinblanchard commented 8 months ago

@jleclanche thank you, this is great! I think we should be good for now with the stripe IDs, but will reach out if we run into any issues. Let us know if you need any help with migration testing / use cases when the new versions are ready.

jleclanche commented 8 months ago

@dustinblanchard really appreciate it. Yes I will absolutely need help with it. Please feel free to pick a time on my calendar, I'd love to do a quick q&a of your usage of dj-stripe: https://cal.com/jleclanche

onekiloparsec commented 1 month ago

It sounds like you're saying the recommended approach is basically:

  1. Nuke all your dj-stripe data - preserving whatever information you need to ensure you still have references to the stripe models you need in your own foreign keys.
  2. Upgrade dj-stripe
  3. Sync your data back
  4. Restore the foreign keys, etc. to the new djstripe models

Does that sound right? When I get to upgrading my app, I can attempt to run through it and perhaps provide some patterns for doing this.

I am about to migrate to 2.8, and as far as I understand, this approach outlined above is the best one, right? When talking about the foreign keys, it means any possible custom Django model having a foreign key on a DJstripe model, right?

Moreover, is there any "easy" way to whipe up all dj-stripe data? No risk of deleting any Stripe data? I am a bit worried, although I mostly rely on only two models: Customer and Subscription.

Any additional hints would be greatly appreciated, thanks!

jleclanche commented 1 month ago

The easiest way of wiping the djstripe data is to truncate all the tables beginning with the djstripe_ prefix. I can tell you, however, that djstripe is OK to be operated this way; syncing the data back from stripe is the "painful" part (as in, it takes time) but other than that it's easy.

This is NOT required for 2.8 migration. All migrations in the 2.x->2.x+1 path are safe to run without wiping data.

onekiloparsec commented 1 month ago

Thank you @jleclanche for your input. You say basically I can run smoothly the migration from 2.7.3 to 2.8.0 without wiping out all the data. But today I have DJSTRIPE_USE_NATIVE_JSONFIELD = False, that's why I was asking how to proceed.

Sorry for double-checking, I am always very cautious with DB migrations.

jleclanche commented 1 month ago

You can try the migration path here but yeah you'll have to be ready to go with a wipe.

I would suggest making a copy of your db and running the migration on the copy first.