Closed sauloperez closed 5 years ago
We're currently monitoring UK, FR and Katuma but I'd also like to add other smaller instances, finding a balance between ease of management (all instances using the same services) and pricing. As for AU, it was agreed in https://github.com/openfoodfoundation/ofn-install/issues/273 that it'll stick to Wormly instead of Datadog. IMO instances that don't use ofn-install fall off the list.
If we go up to 5 hosts that is 90$ per month, whereas if we stick to 3 hosts: UK, FR, Katuma that is 54$. Both amounts are totally affordable IMO.
If we paid annually we would save 3$ per host per month. Is it worth the cost of being locked in with Datadog for a year? Chances are that we won't move away from it in a year though. Another option is to pay as we go but that feels really scary to me. I guess it's more of an accounting/management question.
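For reference, a minimal sketch of the arithmetic behind these figures. The 18$ per host per month rate is not quoted directly anywhere in this thread; it is only implied by 90$ for 5 hosts and 54$ for 3 hosts. The function and constant names are made up for illustration.

```python
# Assumption: 18$/host/month, implied by 90$/5 hosts and 54$/3 hosts.
MONTHLY_PER_HOST = 18
# From the thread: paying annually saves 3$ per host per month.
ANNUAL_DISCOUNT_PER_HOST = 3


def monthly_cost(hosts, annual=False):
    """Return the monthly bill in USD for the given number of hosts."""
    rate = MONTHLY_PER_HOST - (ANNUAL_DISCOUNT_PER_HOST if annual else 0)
    return hosts * rate


for hosts in (3, 5):
    monthly = monthly_cost(hosts)
    annual = monthly_cost(hosts, annual=True)
    saved_per_year = (monthly - annual) * 12
    print(f"{hosts} hosts: {monthly}$/month billed monthly, "
          f"{annual}$/month billed annually, {saved_per_year}$ saved per year")
```

Under these assumptions the annual commitment saves 108$ per year for 3 hosts and 180$ per year for 5 hosts, which is the amount to weigh against being locked in.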
As we did with Bitwarden, I'd like this to be paid by Katuma as a contribution to the global pot.
thoughts @myriamboure @Matt-Yorkley @mkllnk @luisramos0 ?
What about data retention with these prices?
You know I'd put this money right away into metal... you can almost double all those 5 servers' capacity with this money #performance
Anyway, it will be nice to have this data!!!
I know @luisramos0 but what's the value of such metal if we have no idea what is going on under the hood? It's like having a powerful car without any idea where to drive it.
What about data retention with these prices?
That comes with 15-month data retention.
What I see now is that we might need to pay for the Application Performance Monitoring product to get the Delayed Job integration. I just sent an email to customer support to get a clear answer. If that were the case, it'd be an extra 31$ per host per month. IMO we could live without it as long as we do get the PostgreSQL integration.
we currently have a 30-sec page load time on the shops list and the map, and the backoffice for super admins on the main instances is a total joke. it's a tricky discussion, I understand you see other priorities, that's why I only shared what I'd do: buy metal. imo we are not doing correct infrastructure capacity planning.
I understand you see other priorities
Not at all. We do have the same ones, it's just that we see other ways to reach the same goal. I think I've shared this screenshot several times but how will more metal on top of the one that is already underused change the situation? See UK's production stats below.
With the current specs they experience regular downtime, not to mention the 1h 40min of downtime OFF had with the same specs, for which I wrote a postmortem.
I would like to see how throwing metal at this could change the situation but I don't. That would be ideal as we could focus our efforts on other things.
uptime and performance are two different priorities.
I have seen metal help with uptime even when the data is looking normal as those images show (load average of 3 for 4 cpus is something, it's not like the server is sleeping).
but it can also be a bug.
in the past I have used monitoring data to detect problems, very rarely to fix problems.
uptime and performance are two different priorities.
Absolutely. While I stopped working on the second to stick to the priorities until we have the gathering, I don't think we can afford not to care about the former.
in the past I have used monitoring data to detect problems, very rarely to fix problems.
And that is all I want
Let's not forget this is also needed to have decent data retention to allow us to spot problems caused by the v2 roll out as explained in https://community.openfoodnetwork.org/t/making-operations-a-first-class-citizen/1601
That comes with 15-month data retention.
15 days?
First impressions:
We pay $36 for Wormly in Australia. It would increase if we added a lot more hosts for monitoring but probably not as much. They don't bill by host, they bill by how many metrics you are monitoring. Anyway, my conclusion is that those prices are comparable and there is not much in it.
If we talk about an additional $31 for APM, that's very expensive. That would be $155 extra for the five hosts.
Can we select which hosts we go paid for and which ones we don't, or do we then need two accounts?
It's probably enough to get data retention for two or three big instances and maybe APM for one instance. The application behaves pretty much the same on each host. That makes it affordable and should give us all the data we need.
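To put rough numbers on that idea, here is a small sketch comparing a full rollout with a mixed one. It assumes 18$ per host per month for the paid plan (implied by 90$ for 5 hosts) and the 31$ per host per month APM figure mentioned above, which was still awaiting confirmation from customer support. Whether Datadog lets us mix paid and free hosts in one account is the open question asked above; `plan_cost` is a hypothetical helper, not anything Datadog provides.

```python
# Assumption: paid plan rate implied by 90$/5 hosts earlier in the thread.
INFRA_PER_HOST = 18
# From the thread, pending confirmation by customer support.
APM_PER_HOST = 31


def plan_cost(paid_hosts, apm_hosts=0):
    """Monthly cost in USD: paid-plan hosts plus hosts with APM enabled."""
    return paid_hosts * INFRA_PER_HOST + apm_hosts * APM_PER_HOST


# Everything on all five hosts vs. retention on three and APM on one.
print("5 paid hosts, APM on all 5:", plan_cost(5, 5), "$/month")
print("3 paid hosts, APM on 1:    ", plan_cost(3, 1), "$/month")
print("3 paid hosts, no APM:      ", plan_cost(3), "$/month")
```

With these assumed rates the three options come to 245$, 85$ and 54$ per month respectively.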
That's another option we could explore in the future. I would keep things agile for now and stick to the paid plan to get longer retention plus the PostgreSQL integration. This would be a lot already as we already have a basic APM with Skylight.
To answer you @Matt-Yorkley
do we need 15-month data retention?
We don't but that's what they offer.
@sauloperez I think in France we are ok to go forward with this :) Let's do it!
Done! Say hello to a wealth of data and more to come!
👆 France production in the last 3 months. I did my best to get some sort of discount but no luck. We'll see as we pay for more hosts in the future 🤞
As explained in-depth in https://community.openfoodnetwork.org/t/making-operations-a-first-class-citizen/1601 we need to start paying for Datadog to get data retention longer than a day plus valuable integrations such as Postgres and Delayed Job, essentials at this point for OFN.