openedx / edx-platform

The Open edX LMS & Studio, powering education sites around the world!
https://openedx.org
GNU Affero General Public License v3.0

Remove boto, upgrade boto3 & botocore #31175

Closed jmbowman closed 1 year ago

jmbowman commented 2 years ago

edx-platform is still using versions of boto, boto3, and botocore that were originally pinned (by me) 4.5 years ago before the repo even had a constraints.txt file. boto isn't even supported anymore, and I recently stumbled into a performance problem that might just be fixed automatically by an upgrade. Please get us to reasonably current versions. Suggested sequence:

  1. Move the constraints to constraints.txt so we actually notice they exist
  2. Do any work needed to remove the dependency on boto
  3. Start upgrading botocore and boto3 in parallel; try to advance just far enough to cross one potentially breaking change each time.

Finding the breaking changes may be difficult, as the boto3 and botocore changelogs are almost useless lists of daily API change notifications, and we don't really use boto3 in CI. But it looks like particular care may be needed for the upgrade to 1.9.0: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/upgrading.html . And I suspect we don't use the vast majority of services covered in the changelog, so we may be ok just looking for changes related to s3 and maybe 1-2 other AWS services. Writing a doc on how to decide if it's safe to upgrade to a given boto3 or botocore release is probably a good idea, if you figure that out from this task.

### Tasks
- [x] https://github.com/openedx/edx-platform/issues/31771
- [x] https://github.com/openedx/edx-platform/issues/31772
- [x] https://github.com/openedx/edx-platform/issues/31774
- [x] https://github.com/openedx/edx-platform/issues/31775
- [x] https://github.com/openedx/edx-platform/issues/31776
- [x] https://github.com/openedx/edx-platform/issues/31777
- [ ] https://github.com/openedx/edx-platform/issues/31779
- [ ] https://github.com/openedx/edx-platform/issues/31780
- [x] https://github.com/openedx/edx-platform/issues/31778
- [x] https://github.com/openedx/edx-platform/issues/31781
- [ ] https://github.com/openedx/edx-platform/issues/31914
- [ ] https://github.com/openedx/edx-platform/issues/31955
- [x] https://github.com/openedx/edx-platform/issues/32376

awais786 commented 1 year ago

edxapp is using django-storages 1.8.

The legacy S3BotoStorage backend was removed in version 1.9. This is the PR that removed it: https://github.com/jschneier/django-storages/pull/825

We can move from s3boto to s3boto3 while staying on the existing django-storages 1.8. One of the main things to take care of is exception handling.

s3boto uses these exceptions: https://github.com/boto/boto/blob/2.39.0/boto/exception.py

s3boto3 uses these exceptions: https://github.com/boto/botocore/blob/1.8.17/botocore/exceptions.py

Plan:

  1. Handle both sets of exceptions simultaneously: replace every occurrence of `from storages.backends.s3boto import S3BotoStorage` with `from storages.backends.s3boto3 import S3Boto3Storage`, and update the exception handling to catch both boto and botocore exceptions. (Merge and deploy this.)
  2. Create the corresponding PR in the internal repo and deploy it.
  3. Remove all remaining boto usages and exceptions, then deploy.
  4. If everything goes as planned, upgrade django-storages to 1.9 as per the documentation.
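
A minimal sketch of what step 1's "handle both boto and botocore exceptions" could look like during the transition. The helper names `collect_storage_errors` and `read_or_default` are hypothetical and not taken from edx-platform; the sketch only assumes the base exception classes defined in the two modules linked above, and it tolerates either library being absent.

```python
# Transitional sketch: catch both boto and botocore errors during the migration.
# `collect_storage_errors` and `read_or_default` are hypothetical helper names.


def collect_storage_errors():
    """Return a tuple of the boto/botocore base exceptions that can be imported."""
    errors = []
    try:
        # Legacy boto 2.x exceptions; this branch goes away once boto is removed.
        from boto.exception import BotoClientError, BotoServerError
        errors += [BotoClientError, BotoServerError]
    except ImportError:
        pass
    try:
        # botocore exceptions, which are what boto3-based storage raises.
        from botocore.exceptions import BotoCoreError, ClientError
        errors += [BotoCoreError, ClientError]
    except ImportError:
        pass
    return tuple(errors)


STORAGE_ERRORS = collect_storage_errors()


def read_or_default(read_func, default=None, errors=STORAGE_ERRORS):
    """Call read_func(); return `default` if it raises one of `errors`."""
    try:
        return read_func()
    except errors:
        return default
```

Once boto is gone (step 3), the first `try` block can simply be deleted and the call sites stay unchanged.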

iamsobanjaved commented 1 year ago

There are multiple (6+) pipelines to upgrade from the boto-based backend to the boto3-based backend. We can upgrade them incrementally, merging the changes for two pipelines at a time so that we can easily verify on stage before pushing our changes to prod.

awais786 commented 1 year ago

https://django-storages.readthedocs.io/en/latest/backends/amazon-S3.html#migrating-from-boto-to-boto3

awais786 commented 1 year ago

We have successfully upgraded django-storages to 1.9.1. Now only the videos module is left with a boto import.

To upgrade further to 1.10.1 (https://github.com/jschneier/django-storages/blob/master/CHANGELOG.rst#1101-2020-09-13), we need to change a few settings variables and also handle the boto3 warning.

awais786 commented 1 year ago

No usage found

UsamaSadiq commented 1 year ago

Usage found in

The following three repos are interlinked and use Python 2.7, so it is not possible to update them to boto3.

awais786 commented 1 year ago

@jmbowman so far I have found usage here. I have created issues in these repos, except luigi.

jmbowman commented 1 year ago

There's a "boto usage" filter view in the Repo Health Dashboard spreadsheet that shows just repos with "boto==" in the `dependencies.pypi_all.list` field; it currently includes 22 repos. Some of them bring it in due to pinning to a stale version of moto, such as https://github.com/edx/video-encode-manager/blob/master/requirements/constraints.txt#L7-L8 .

Also, there seems to be a bug in the implementation of the requires.boto check. It's meant to be an easier way to find these repos, but it incorrectly has False for most of them.
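
For illustration only, here is a rough sketch of how a "boto==" check could tell the legacy package apart from boto3 and botocore. This is not the actual Repo Health `requires.boto` implementation; `BOTO_PIN` and `pins_legacy_boto` are hypothetical names, and the input is assumed to be the lines of a compiled requirements file.

```python
import re

# Matches a requirement line that pins the legacy "boto" package exactly.
# "boto3==..." and "botocore==..." do not match, because "boto" must be
# followed directly by optional whitespace and "==".
BOTO_PIN = re.compile(r"^\s*boto\s*==\s*[\w.]+", re.IGNORECASE)


def pins_legacy_boto(requirement_lines):
    """Return True if any line pins boto itself (not boto3 or botocore)."""
    return any(BOTO_PIN.match(line) for line in requirement_lines)
```

Usage, with made-up version pins:

```python
pins_legacy_boto(["boto==2.39.0", "boto3==1.4.8"])      # legacy boto is pinned
pins_legacy_boto(["boto3==1.4.8", "botocore==1.8.17"])  # only the newer libraries
```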

awais786 commented 1 year ago

Lots of the entries come from xblock-sdk. I have already fixed that; we just need to merge and release a new version.

awais786 commented 1 year ago

PR created for openedxstats: https://github.com/openedx/openedxstats/pull/209/files. https://github.com/openedx/openedxstats/pull/211 uses moto for tests.

awais786 commented 1 year ago

We have successfully upgraded django-storages to 1.10.1. The next target is 1.11.1.

awais786 commented 1 year ago

django-storages has been upgraded to the latest version in edx-platform. I think we can close this PR now. Boto is still in use in the videos module. I have already informed the owning team; they have the following ticket: https://2u-internal.atlassian.net/browse/TNL-10698