Closed: alexcos20 closed this issue 2 years ago
For some UI context: we tried to make resource usage clearer by exposing the hardcoded defaults we set. This is what we have in Market v4
within the publish form, so that would be the use case to extend. In the end, multiple services with multiple prices could be created in the UI based on different resource types:
The desired compute environment is going to be selected by the consumer before ordering the service. It doesn't make any sense to have this enforced by the publisher.
It's up to the consumer. If I'm short on money, I will choose an environment with 1 CPU, and by paying less I accept a longer job duration. And vice versa: if I'm in a hurry and the algorithm can use multiple CPUs, I will pay for an environment with 128 CPUs.
There can be big problems if the consumer chooses the compute environment instead of the provider, especially for more complex machine learning algorithms.
In the case of deep learning, an algorithm can be run on CPUs, but it can then take 45x longer. So it is possible, but instead of running for a day it runs for 45 days, and a CPU blocked for that long is not really cheap either. Other deep learning algorithms need a certain amount of GPU memory or they won't work properly. A configuration chosen by the consumer may simply not be feasible. As there is no way to revoke a job after the consumer has paid, this would create problems that can only be solved manually, e.g. by the provider sending the funds back to the consumer.
We could define in the DDO a list of minimum requirements and display only the environments that fulfill those minimum requirements.
In the current v4 setup, C2D resources, prices, and flow are very unclear. How do you define compute resources? How can you have multiple environments (CPU, RAM, disk setups)?
Proposal:
C2D should have multiple environments (technically speaking, namespaces) which are exported by op-service in the root endpoint. Each environment has its own characteristics. The environments object should look like:
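As a sketch of what such an environments object could contain (the field names and values below are illustrative assumptions, not a final schema), each entry would describe one namespace with its own resource limits and pricing:

```python
# Hypothetical environments object exported by op-service's root endpoint.
# All field names here are assumptions for illustration.
environments = [
    {
        "id": "env-basic",        # environment id (namespace)
        "cpu": 1,                 # number of CPUs
        "ram": "1Gi",             # memory limit
        "disk": "5Gi",            # disk limit
        "gpu": 0,                 # number of GPUs
        "priceMin": 1.0,          # price per minute, in providerFeeToken
        "maxJobDuration": 3600,   # max job duration, seconds
    },
    {
        "id": "env-gpu",
        "cpu": 8,
        "ram": "32Gi",
        "disk": "100Gi",
        "gpu": 1,
        "priceMin": 10.0,
        "maxJobDuration": 86400,
    },
]
```

The consumer would then pick one environment id from this list when ordering a compute job.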
Provider will expose these envs in its root endpoint as well, adding providerFeeToken (defined in the Provider env, because each network will have its own providerFeeToken address, e.g. the USDT address on mainnet != the one on Polygon).
When publishing a compute dataset, we DO NOT specify cpu, ram, etc., only the serviceEndpoint.
When publishing an algorithm, specify minimum requirements (cpu, ram, etc.), so that only suitable environments can be used.
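Combining the two points above with the earlier suggestion to show only environments that fulfill the algorithm's minimum requirements, the filtering could be sketched like this (field names and units are assumptions):

```python
# Hypothetical provider-side environments list (units assumed: RAM in GiB).
envs = [
    {"id": "env-basic", "cpu": 1, "ram_gb": 1, "gpu": 0},
    {"id": "env-gpu", "cpu": 8, "ram_gb": 32, "gpu": 1},
]

def matching_envs(envs, min_req):
    """Keep only environments that meet every minimum requirement
    declared in the algorithm's DDO (missing keys default to 0)."""
    return [
        e for e in envs
        if e["cpu"] >= min_req.get("cpu", 0)
        and e["ram_gb"] >= min_req.get("ram_gb", 0)
        and e["gpu"] >= min_req.get("gpu", 0)
    ]

# An algorithm that needs at least 4 CPUs, 16 GiB RAM and one GPU:
selected = matching_envs(envs, {"cpu": 4, "ram_gb": 16, "gpu": 1})
```

The UI would then render only `selected` in the environment picker, so a consumer cannot order an infeasible configuration.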
Consume flow:
Downside: I will pay 1 DT and 10 USD to test my algo using a cheap CPU. Once my algo is tested, I will have to buy 1 DT again and pay another providerFee in order to run in a high-performance env.

In V4.1, we will separate the process in two, meaning that you would buy the DT to have access to the data for a specific period, and purchase a separate compute env for a different period. (I.e.: I buy access to the data for one month. I will buy 10 mins of the cheapest env to test my algo, then buy an expensive C2D env to run my algo several times. So I will pay once for the data, and multiple fees for compute.)

How to separate data access & provider resources:
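The separation described above could be sketched as two independent purchases, each carrying its own validity period (function and field names are hypothetical, not an existing API):

```python
# Sketch: data access and compute resources are paid for separately,
# each with its own validUntil. All names are illustrative assumptions.

def order_data_access(now, period_seconds):
    # One datatoken order grants access to the data until validUntil.
    return {"type": "startOrder", "validUntil": now + period_seconds}

def buy_compute(now, env_id, minutes, price_per_min):
    # A separate provider-fee purchase per compute environment and period.
    return {
        "env": env_id,
        "fee": minutes * price_per_min,
        "validUntil": now + minutes * 60,
    }

now = 0
data = order_data_access(now, 30 * 24 * 3600)        # data for one month
test_run = buy_compute(now, "env-basic", 10, 1.0)    # 10 min on cheapest env
big_run = buy_compute(now, "env-gpu", 120, 10.0)     # expensive env later
```

The point is that `data` is bought once, while `buy_compute` can be called repeatedly against different environments without re-buying the datatoken.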
Notice the extra validUntil parameter in the ProviderFees event.
The logic for consume is the following:
Provider logic is the following, given a txId received from the consumer (can be a startOrder or a reuseOrder tx):
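A minimal sketch of that provider-side check, assuming the tx carries the ProviderFees data with its validUntil (all field names are assumptions):

```python
def validate_order(tx, now, expected_env):
    """Validate a txId received from the consumer. The tx can be a
    startOrder or a reuseOrder; its provider fees must still be valid
    and must have been paid for the requested environment."""
    if tx["type"] not in ("startOrder", "reuseOrder"):
        return False, "not an order tx"
    fees = tx["providerFees"]
    if fees["validUntil"] <= now:
        return False, "provider fees expired: a reuseOrder with new fees is needed"
    if fees["environment"] != expected_env:
        return False, "fees were paid for a different environment"
    return True, "ok"
```

On an expired validUntil the consumer keeps data access but must pay new provider fees via reuseOrder, which is exactly the data/compute split described earlier.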