bedatadriven / activityinfo-R

ActivityInfo R Language Client
https://www.activityinfo.org/support/docs/R/
17 stars 12 forks source link

Fixes #66. Use direct upload to GCS to avoid import limit. #70

Closed akbertram closed 1 year ago

akbertram commented 1 year ago

This updates the importTable() and stageImport() functions to use a signed URL to upload the data to be imported directly to Google Cloud Storage. I have left this as an option because it many users in Syria, Iran, etc are blocked from accessing storage.googleapis.com directly because of sanctions. (Our users are UN agencies and NGOs who are not subject to these sanctions!)

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 57.14% and project coverage change: -0.21 :warning:

Comparison is base (556deba) 67.68% compared to head (036c0a9) 67.48%.

:exclamation: Current head 036c0a9 differs from pull request most recent head f85090b. Consider uploading reports for the commit f85090b to get more accurate results

:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #70 +/- ## ========================================== - Coverage 67.68% 67.48% -0.21% ========================================== Files 13 13 Lines 1176 1181 +5 ========================================== + Hits 796 797 +1 - Misses 380 384 +4 ``` | [Impacted Files](https://codecov.io/gh/bedatadriven/activityinfo-R/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=BeDataDriven) | Coverage Δ | | |---|---|---| | [R/import.R](https://codecov.io/gh/bedatadriven/activityinfo-R/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=BeDataDriven#diff-Ui9pbXBvcnQuUg==) | `47.19% <57.14%> (-0.66%)` | :arrow_down: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=BeDataDriven). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=BeDataDriven)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

nickdickinson commented 1 year ago

Thanks @akbertram ! Do we need some kind of pagination? I'll merge this shortly.

akbertram commented 1 year ago

Thanks @akbertram ! Do we need some kind of pagination? I'll merge this shortly.

Pagination is not relevant here. This allows the R client to upload the data to import directly to Google Cloud Storage (limit 5 TiB). The file is then accessible to the import job which reads it from google storage.

akbertram commented 1 year ago

For the regex, should you maybe add ^ and $ to make sure the appspot url is actually your instance? "www.activityinfo.org||appspot.com"

Do you want to add a test? Should the preproduction server also be added?

If it's hosted on appspot.com, then it's running on AppEngine, which is not possible for the self-managed version. The pre-prod environment also matches this pattern.

nickdickinson commented 1 year ago

And is this supposed to be merged into the main branch or v4.33?

akbertram commented 1 year ago

And is this supposed to be merged into the main branch or v4.33?

Sorry, should have been 4.33