Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0

Billing #65

Closed m-mohr closed 6 years ago

m-mohr commented 6 years ago

According to the opinions voiced at the kick-off meeting (see meeting notes), one important aspect of making openEO a success is billing. After the POC we should work something out to have this included. I am trying to give it a first shot...

What it potentially could include:

  1. Topping up credits, i.e. the payment process to the back-end provider itself
  2. Checking the balance / credits (available as GET /users/:user_id/credits)
  3. Calculating the cost of an operation, i.e. a job that is computed, including costs for occupied resources and data sets (the biggest [interoperability] issue to solve).
  4. Spending credits by executing operations and using resources.

I'd suggest not bloating the API with a payment API itself and therefore excluding (1) topping up credits. There is no open payment API anyway afaik. Therefore, we would either have to settle on a proprietary API, e.g. Stripe, or develop our own, which is out of scope. So back-end providers should have their own web services to manage user accounts and let users top up their credits. That simplifies the whole process and makes things more flexible with regard to supported payment services, corporate billing etc.

(2) should be a pretty easy task. A user can have a certain amount of credits in a certain currency. An easy response could be for GET /users/:user_id/credits:

{
  "value": 120.67,
  "currency": "USD"
}

The currency would be a standardized currency code as defined in ISO 4217. This credits object would be used in several contexts, e.g. also when a user is informed about costs. I am wondering whether "unlimited" credits should also be possible in case credits are post-paid and not pre-paid by topping up.
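
A client could read such a credits object roughly as follows. This is only a sketch: the `Credits` class and `parse_credits` function are hypothetical names, and the currency check only tests the three-letter shape of an ISO 4217 code, not membership in the official list.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Credits:
    """Hypothetical client-side model of the credits object."""
    value: Decimal
    currency: str  # expected to be an ISO 4217 code such as "USD"


def parse_credits(body: str) -> Credits:
    # Parse floats as Decimal so monetary amounts are not rounded in binary.
    data = json.loads(body, parse_float=Decimal)
    currency = data["currency"]
    # Shape check only (three upper-case ASCII letters); this does not
    # verify that the code is actually assigned in ISO 4217.
    looks_like_iso = (len(currency) == 3 and currency.isascii()
                      and currency.isalpha() and currency.isupper())
    if not looks_like_iso:
        raise ValueError(f"not an ISO 4217-style code: {currency!r}")
    return Credits(value=Decimal(data["value"]), currency=currency)


credits = parse_credits('{"value": 120.67, "currency": "USD"}')
print(credits.value, credits.currency)
```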

We should leave currency conversion up to the user. Having our own currency (openEO credits or whatever) seems to make things complicated for both back-ends and users. Back-ends would have to convert their currency to ours and users would have to convert back to their local currency. That's two steps (USD -> credits -> EUR) instead of one step (USD -> EUR). The only advantage would be that it allows easier comparison between services.

Calculating and communicating the costs (3) is probably the hardest to do. The costs could include:

  A. costs for computations, i.e. how many CPU hours are used in total
  B. costs for storage
  C. costs for data sets
  D. potentially other costs, e.g. traffic

To calculate (A), (B) and (D), a user might need to specify somehow which resources he would like to be used for his computations. Factors could be the number of CPU cores, RAM, use of SSD, disk space, ... (C) is already fixed by the given process graph. Maybe pre-defined configurations with specific resources would be easier to handle.

With all that information available, a back-end would need to somehow specify the cost. A back-end could (I) run the process on a small subset, scale up that result and specify an amount to be paid. That would make sense mostly for batch jobs and could be specified as average or maximum cost. Alternatively, (II) it just tells the user what the influencing factors and individual costs are and leaves him to decide whether that is a good deal or not. That's basically required for web services with on-demand computations. We probably need to have both ways specified in our API. Approach (II) will probably be the default as it can be applied to all scenarios. (I) is probably an optional addition back-ends may support or not. For (I) there needs to be a job-related endpoint to calculate credits, e.g. GET /jobs/:job_id/costs, or we need to communicate it somehow after job creation, e.g. via GET /jobs/:job_id, but cost calculation might be too slow for this. That also means that a user has to specify all influencing cost factors during job creation, e.g. how many CPU cores he wants to use. If nothing is specified, the back-end uses some default that needs to be specified somewhere. A user should have the possibility to set a maximum amount of credits he wants to spend during job creation.
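
To make approach (I) more tangible, here is a minimal sketch of scaling up the cost measured on a subset. It assumes that cost grows linearly with the processed fraction of the data, which will not hold for every process graph. The names `Estimate`, `estimate_from_sample` and the `safety_margin` parameter are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical result of a subset-based cost estimate."""
    average: float
    maximum: float
    currency: str


def estimate_from_sample(sample_cost: float, sample_fraction: float,
                         safety_margin: float = 0.25,
                         currency: str = "USD") -> Estimate:
    """Extrapolate the cost of a full job from a run on a small subset.

    Assumption: cost scales linearly with the fraction of data processed.
    The safety margin turns the average into a maximum the back-end
    would be willing to commit to.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    if sample_cost < 0 or safety_margin < 0:
        raise ValueError("costs and margins must not be negative")
    average = sample_cost / sample_fraction
    maximum = average * (1 + safety_margin)
    return Estimate(round(average, 2), round(maximum, 2), currency)


# A run on a quarter of the data cost 2.00 USD.
print(estimate_from_sample(sample_cost=2.0, sample_fraction=0.25))
```

A user-supplied maximum amount of credits could then simply be compared against `maximum` before the job is started.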

(4) is then done during execution. Canceling/pausing a job or deleting a service should stop producing costs. If the user runs out of credits, he needs to be informed and running jobs need to be paused in the best case. Discarding previously computed results in this case doesn't sound right. A user can query for already consumed credits by requesting GET /jobs/:job_id or subscribing to a job (GET /jobs/:job_id/subscribe). consumed_credits contains a credits object as specified above (still incomplete as of v0.0.2).
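
The pause-instead-of-discard behaviour could look roughly like this on the back-end side. It is a sketch under the assumption that a job can be split into separately billable steps; `MeteredJob`, `JobStatus` and all attribute names are hypothetical and not part of the API.

```python
from decimal import Decimal
from enum import Enum
from typing import List, Optional


class JobStatus(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    FINISHED = "finished"


class MeteredJob:
    """Hypothetical job that charges the user's balance step by step."""

    def __init__(self, step_costs: List[Decimal], balance: Decimal,
                 max_budget: Optional[Decimal] = None) -> None:
        self.step_costs = step_costs
        self.balance = balance          # credits the user has left
        self.max_budget = max_budget    # optional per-job spending cap
        self.consumed = Decimal("0")    # what consumed_credits would report
        self.next_step = 0
        self.status = JobStatus.PAUSED

    def run(self) -> JobStatus:
        """Run until done, or pause before a step that cannot be paid for."""
        self.status = JobStatus.RUNNING
        while self.next_step < len(self.step_costs):
            cost = self.step_costs[self.next_step]
            over_budget = (self.max_budget is not None
                           and self.consumed + cost > self.max_budget)
            if cost > self.balance or over_budget:
                # Keep finished steps; only stop producing further costs.
                self.status = JobStatus.PAUSED
                return self.status
            self.balance -= cost
            self.consumed += cost
            self.next_step += 1
        self.status = JobStatus.FINISHED
        return self.status


job = MeteredJob([Decimal("1")] * 3, balance=Decimal("2"))
print(job.run(), job.consumed)   # pauses after two paid steps
job.balance += Decimal("1")      # user tops up outside the API
print(job.run(), job.consumed)   # resumes and finishes
```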

Input from the back-end providers is highly appreciated. @jdries @aljacob @gunnarbusch @neteler @mkadunc (hope I have tagged one person for each back-end)

edzer commented 6 years ago

Good thinking! Pause/resume is an interesting thought, perhaps somewhat too fancy for now.

I think (3) has to be kept simple in the API. I think it should be reduced to asking the back-end what it would cost to carry out a particular task (characterised by a process graph, an extent, and a resolution). Issues of storage, computing and data access are all back-end responsibilities.

christophfriedrich commented 6 years ago

In my opinion, the billing aspect should be kept as simple as possible since that is not really what this project is about (we're not talking about this because someone thinks it's a really cool feature, but out of necessity).

I rarely use EO services and have never paid for one, so I'm absolutely no expert - I had a quick look at the Sentinel Hub pricing, but other than that these are just my general thoughts in which I will outline what the average user thinks.

I completely agree that 1. topping up credits should be left to the back-end provider. If they want to make money from it, let them deal with how to receive that money.

I think the same about 3. calculating the costs. If someone wants my money for a service, I ask for a quote - how to figure out that quote is their responsibility. I would love an exact quote (i.e. "X dollars and Y cents"), but understand that this may be harder to give than I might think. Maybe this can be addressed by allowing fields like from and to aka min and max instead of requiring one exact quote.

The available options to be taken into account when requesting the quote should be kept extremely simple, too. Sure, the number of CPU cores or whether to use SSDs or not might be relatively common options, but the possibilities are literally endless. For example, one could just as well think of options like "use solar power only", "if something fails I want 24/7 phone support", "fix my carbon footprint" and whatnot. This never stops and it's hard to draw a line somewhere. And: Are options like that actually a thing? My experience is that the trend in billing goes towards monthly flat rates. Sinergise simply bills 512x512px tiles (or equivalents) only (source), which are easy to calculate when given just resolution and extent as Edzer suggests. Therefore I'd argue: Leave out anything that is not mandatory! If providers DO want to offer more complex pricing models, this can still be solved outside the API (for example, they could offer user preferences that the user may change by logging into the same system that is also used to top up credits).
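
To illustrate how little input a tile-based quote needs, here is a small sketch that counts 512x512px tiles from an extent and a resolution. It assumes a projected extent given in metres and rounds up partial tiles per axis; that rounding rule and the function name `count_tiles` are my assumptions, not Sinergise's actual billing formula.

```python
import math


def count_tiles(width_m: float, height_m: float, resolution_m: float,
                tile_px: int = 512) -> int:
    """Number of tile_px x tile_px tiles needed to cover an extent.

    Assumptions: the extent is given in metres, the resolution in metres
    per pixel, and every partially covered tile is billed as a full tile.
    """
    if width_m <= 0 or height_m <= 0 or resolution_m <= 0 or tile_px <= 0:
        raise ValueError("all arguments must be positive")
    width_px = math.ceil(width_m / resolution_m)
    height_px = math.ceil(height_m / resolution_m)
    return math.ceil(width_px / tile_px) * math.ceil(height_px / tile_px)


# 10.24 km x 5.12 km at 10 m resolution is 1024 x 512 px.
print(count_tiles(10240, 5120, 10))
```

Multiplying the result by a per-tile price would give the exact quote asked for above.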

Regarding 4. spending credits: I too think that pause/resume would be very nice, but it makes things more complex than they need to be (at the moment). Requiring every process to be stoppable after every credit-consuming sub-operation may not be that hard, but requiring it to be resumable adds a whole new layer of complexity.

Fortunately, 2. checking credit indeed seems simple. The "unlimited" option for post-paid contracts etc. seems useful. I'd strongly argue against any kind of "openEO currency" as that is out of scope of the project. Just let the back-end provide the balance in whatever currency (or "pseudo currency") that provider works with: Euros, dollars, tiles, abstract credits... Other than being harder to compare across providers, I see no problem in that.

I hope that these thoughts are useful and would also highly appreciate feedback from the back-end providers Matthias tagged.

m-mohr commented 6 years ago

Good thoughts. I finally have a more concrete idea how we could solve this.

We need to specify what happens if the back-end uses "proprietary" credits instead of a "normal" currency that is specified in ISO 4217. Is the currency field empty or does the back-end specify a non-ISO value? Probably the latter is better, i.e. you either use an ISO 4217 currency code or a proprietary name, e.g. "SentinelHub tiles".

Regarding unlimited credits: If we use

{
  "value": 120.67,
  "currency": "USD"
}

as "money" definition, how would unlimited credits be represented?

Regarding 3 and the influencing aspects/options: We might just want to add an optional flag that could contain something like a "plan". Providers could have different proprietary plans that can be stated in the request. For example, a provider could have "basic" and "fast" where "fast" is more expensive, but uses multiple servers instead of one (or whatever). Then they could also have a "solar power only" plan etc. We don't need to deal with all these options, but back-ends are free to charge differently.
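
A back-end could resolve such a plan roughly like this. The plan names come from the example above, but the multipliers, the default plan and the function name `price_for` are purely illustrative assumptions.

```python
from typing import Optional

# Hypothetical price multipliers per plan; every back-end defines its own.
PLANS = {"basic": 1.0, "fast": 2.5}
DEFAULT_PLAN = "basic"


def price_for(base_cost: float, plan: Optional[str] = None) -> float:
    """Apply the multiplier of the requested plan to a base cost.

    Falls back to the back-end's default plan if the request names none.
    """
    name = plan if plan is not None else DEFAULT_PLAN
    if name not in PLANS:
        raise ValueError(f"unknown plan: {name!r}")
    return base_cost * PLANS[name]


print(price_for(10.0))          # default plan
print(price_for(10.0, "fast"))  # more expensive plan
```

The API itself only needs to carry the plan name; what the name means stays with the provider.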

Additionally, I'd like to add a citation from previous discussions about this topic:

Kickoff-meeting protocol (Vienna, 12th-13th Oct. 2017):

Commercial aspects, accounting: Do we have to consider it from the beginning on? EP: Yes, but as thin as possible because it is backend problem. Should be trivial. PS: Does it have to do anything with API? Does user need to know about the accounting? WW: Yes, need to know what to pay. NG: GEE was struggling with it because it wasn’t considered from the beginning on. Bring it in early. System that estimates price from beginning on? Problem: Normally don’t know from beginning, no user would accept e.g. 100% more. Alternatively: System warns: You’ve reached € 10.000. AR: Kill then process -> problematic because money and product gone. MK: AWS kills process if there’s no progress for a certain time. NG: GEE same, if pixel shows no progress for certain time.

Also, the kickoff-meeting presentations (day 1) from all project partners have a section about requirements/expectations. Multiple partners listed billing/payment/subscriptions as a major aspect to consider when using openEO.

m-mohr commented 6 years ago

I tried to merge all ideas together and made a first draft for billing in the API. See the referenced commit.

GET / now contains a section about billing, which lists the back-end's currency and the plans supported.

GET /credits was renamed to GET /budget to harmonize names. It allows setting a budget limit or unlimited budget by omitting the limit entry.
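
On the client side, "unlimited by omission" could be handled as sketched below. The key name `limit` and both function names are assumptions based on the description above, not taken from the committed spec.

```python
import json
from decimal import Decimal
from typing import Optional


def parse_budget_limit(body: str) -> Optional[Decimal]:
    """Return the budget limit, or None if the budget is unlimited.

    Assumption: the response is a JSON object in which a missing
    "limit" entry means an unlimited budget.
    """
    data = json.loads(body, parse_float=Decimal)
    limit = data.get("limit")
    return None if limit is None else Decimal(limit)


def within_budget(cost: Decimal, limit: Optional[Decimal]) -> bool:
    # An unlimited budget (None) accepts any cost.
    return limit is None or cost <= limit


limited = parse_budget_limit('{"limit": 50.0}')
unlimited = parse_budget_limit('{}')
print(within_budget(Decimal("80"), limited))    # False
print(within_budget(Decimal("80"), unlimited))  # True
```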

Several job endpoints got properties to set a maximum budget for execution and a plan that is used for processing and billing. Jobs can be asked to return the amount of costs (previously consumed_credits) incurred.

There is a new endpoint /jobs/{job_id}/estimate to get an estimate regarding the costs (duration/money). It is expected to give the maximum amount of costs at the moment.

Billing is implemented as a completely optional feature.

I think this is a relatively simple, but pretty powerful solution.

Pause and resume are moved to issue #76.

m-mohr commented 6 years ago

Closing for now. If there are any suggestions to improve this, feel free to reopen this issue.