zakirullin opened this issue 1 year ago
@zakirullin: Thanks for creating this issue!
That is a great story you shared here. We are tempted to be smart, and by being smart we often harm ourselves. We tend to focus on small, irrelevant things, forgetting about the big picture. We do not do proper analysis; we build software on our assumptions instead.
Wasting time is one thing - if one has the luxury to do so, that is perfectly fine. The complexity that arises from premature optimisation is much worse. Efficiency does not come out of thin air, it comes at the cost of complexity. We as engineers need to keep that in mind.
We could include a few general hints related to optimisation at the end of this section, e.g. optimise for correctness first, make sure that efficiency is your real problem, and do not guess - benchmark.
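If we do add such hints, the "do not guess - benchmark" point could even come with a tiny illustration. A minimal sketch in Python, where `process_items` and the numbers are made up purely for illustration:

```python
# A minimal sketch of "do not guess - benchmark" using the standard timeit module.
# process_items and the sizes below are hypothetical, just for illustration.
import timeit

def process_items(items):
    return [x * 2 for x in items]

items = list(range(100_000))

# Measure the actual cost instead of assuming it is slow.
elapsed = timeit.timeit(lambda: process_items(items), number=100)
print(f"average per run: {elapsed / 100 * 1000:.2f} ms")
```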
@erni27 suggested a new section here.
That's a very good idea, actually. We can start sharing our thoughts and see what emerges.
People overcomplicate things for the sake of optimisation far too often. There is one particular real-world case I would like to highlight.
Oftentimes people optimise things not because the code is actually slow, but because they think it will be slow. The root cause of such behaviour is big numbers combined with a poor understanding of low-level latency numbers.
Once a team got a task: "implement a feature for items processing". The business had somewhat over 100K items in storage.

Simple solution: get those 100K+ items into the app's memory and do the job.

Complex solution: inject a morphology plugin into the storage alongside a Lua script, so that the code executes exclusively on the storage's side, thus avoiding both passing the data over the network and loading all the data into the app's memory.
The justification for this was:
1) That's too much data to transfer over the network
2) 100K items would take up a lot of the app's memory, and it would be slow to go through all of them
All of these were imagined concerns, based on a poor understanding of low-level details.
The reality was:
1) The storage and the app reside in the same cluster
2) It takes ~10 ns (0.00001 ms) to send 1 byte over the network
3) An item's size is ~50 bytes
4) Sending 100,000 items over the network would take 50,000,000 ns (50 milliseconds)
5) Loading 5 MB worth of data into the app's memory is surely not an issue
6) The business grew at a very slow rate - those 100K items were accumulated over 8 years
7) The feature is used by admin users only once in a while
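To make the arithmetic explicit, here is the same back-of-envelope estimate as a few lines of Python (the per-byte latency and the item size are the rough assumptions from the list above):

```python
# Back-of-envelope estimate based on the assumed figures above:
# ~50 bytes per item, ~10 ns per byte over the network within the same cluster.
ITEMS = 100_000
BYTES_PER_ITEM = 50
NS_PER_BYTE = 10

total_bytes = ITEMS * BYTES_PER_ITEM      # 5,000,000 bytes ~= 5 MB
transfer_ns = total_bytes * NS_PER_BYTE   # 50,000,000 ns
transfer_ms = transfer_ns / 1_000_000     # 50 ms

print(f"~{total_bytes / 1_000_000:.0f} MB of data, ~{transfer_ms:.0f} ms to transfer")
```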
While the initial figure of 100,000 may scare developers, the final 50 ms is far less scary. We won't bother calculating the time needed to loop through all those items, as it is a far less significant number.
Unfortunately, this kind of banal analysis wasn't done, and the complex solution was implemented. The team faced serious issues with it along the way.
Given the business growth rate and all the other factors, we had optimised for a situation that could potentially occur in ~200 years. So the increased project complexity will only pay off in ~200 years.
The profit from the optimised solution lies in a far-off, imagined future, whereas the unnecessary cognitive load is here with us right now.