Closed idiom-bytes closed 1 year ago
I like the sentiment here.
But the scope is too large. "For all metrics".
The reason that I gave five steps before this, is that each of those is small and evolutionary, in line with what we are organically doing. They let us explore metrics quickly, locally, without rest APIs and other heavier architectures to slow us down.
Whereas this issue reads like "build a big pipeline". Before we know any metrics.
So, please do NOT do this in its current description.
My recommendation:
Build a small thing that ONLY handles the "calculate accuracy over 2000 samples". That's all. No more.
Then we can see what it looks like, and get used to it. Meanwhile we can keep iterating on earlier steps 1/2/.. in a fast organic way.
I agree that we stop adding more to pdr-ws and instead focus on simpler python-centric data flows.
@trentmc I agree
Let's talk about last_2000 accuracy and organic.
Please refer to this PR as for "this existing inside pdr-backend" and "small"
https://github.com/oceanprotocol/pdr-backend/pull/290/files
=== Item A
(1) This script could just be extended to provide a more flexible interface. Or a similar one that, that may share common, structure, and output.
=== Item B
(2) Another module inside pdr-backend/pdr_backend/analytics
could exist that serves this through a lite flask app
(3) Or. another repo could exists that imports pdr-backend
and serves this through a lite flask app
=== Item C
(4) I don't see the benefits for (3) being a different repo. Analytics (as described in ticket oceanprotocol/pdr-analytics#1) is a backend item. Further, pdr-backend is a monorepo of backend services.
(5) This might help backend coordinate better, as @trizin just finished doing a bunch of python work that would help @kdetry
Let's talk about last_2000 accuracy and organic. Please refer to this PR as for "this existing inside pdr-backend" and "small" https://github.com/oceanprotocol/pdr-backend/pull/290/files === Item A (1) This script could just be extended to provide a more flexible interface. Or a similar one that, that may share common, structure, and output.
Agreed. It needs refactoring though. I just created an issue for that: pdr-backend#296 "get_predictoor_info.py needs to be more modular, and tested".
=== Item B (2) Another module inside pdr-backend/pdr_backend/analytics could exist that serves this through a lite flask app (3) Or. another repo could exists that imports pdr-backend and serves this through a lite flask app === Item C (4) I don't see the benefits for (3) being a different repo. Analytics (as described in ticket https://github.com/oceanprotocol/pdr-analytics/issues/1) is a backend item. Further, pdr-backend is a monorepo of backend services. (5) This might help backend coordinate better, as @trizin just finished doing a bunch of python work that would help @kdetry
Let's keep it all in pdr-backend repo. Simpler, easier to manage, easier to test.
Agreed. It needs refactoring though. I just created an issue for that: pdr-backend#296 "get_predictoor_info.py needs to be more modular, and tested".
Therefore you have a choice:
I recommend that latter. Fewer dependencies, and it'd be good to have this 2000-samples thing in. (It's embarrassing to see accuracies <50%)
Hello @trentmc , it is almost done, I will share the code asap. Thank you for the description
We may need different things in future, eg. currently the get_predictor_info.py script queries based on prediction senders, but I changed the query and the flask app works with contract addresses. I copied the script and worked on it, instead of importing it, I hope it won't be an issue
I copied the script and worked on it, instead of importing it, I hope it won't be an issue
Yes, it's an issue. You should never repeat code, it greatly adds complexity. "DRY = Don't Repeat Yourself" is the # 1 rule of managing complexity.
Given your proposed direction: I recommend doing #296 first, to refactor the script into reusable tested components. Otherwise we end up with unmanageable code.
It is merged and deployed, so we can close the issue
Problem
We need a lite service that integrate across the stack in a variety of ways. As defined in pdr-web oceanprotocol/pdr-analytics#1. This is meant to be a first cut at issue #(6) as outlined by trent here and further summarized here
DoD: