federico-busato / Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)
https://federico-busato.github.io/Modern-CPP-Programming/

Question in data movement calculation in 21.Optimization_I.pdf #90

Closed: xintin closed this issue 5 months ago

xintin commented 6 months ago

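In slide 29/62, the data movement of a naive matrix multiplication (mm) is given as (N^2 · 4B) · 3.

I think it should be (N^3 · 4B) · 3.

Explanation:

Assume each floating-point value occupies B bytes (B = 4 for single precision, which is the slide's 4B, or B = 8 for double precision). The naive triple loop runs N^3 inner iterations, and without any cache reuse each iteration touches one element of each of the three matrices.

The total data movement is therefore approximately 3 · N^3 · B bytes.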
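As a rough illustration of this count, here is a minimal C++ sketch that tallies element accesses in a naive multiplication. The function names are hypothetical and not taken from the slides, and the sketch assumes that nothing is reused from cache or registers, so the update of the output element is counted as one access per inner iteration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the slides): multiplies two n x n matrices
// with the naive triple loop and returns how many matrix elements were
// touched, assuming no reuse from cache or registers.
std::uint64_t naive_mm_accesses(std::size_t n) {
    std::vector<float> a(n * n, 1.0f), b(n * n, 1.0f), c(n * n, 0.0f);
    std::uint64_t accesses = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t k = 0; k < n; ++k) {
                c[i * n + j] += a[i * n + k] * b[k * n + j];
                accesses += 3;  // one element each of the two inputs and the output
            }
        }
    }
    return accesses;
}

// Hypothetical helper: converts the access count to bytes for a given
// element size (4 for float, 8 for double).
std::uint64_t naive_mm_bytes(std::size_t n, std::size_t elem_bytes) {
    return naive_mm_accesses(n) * elem_bytes;
}
```

For example, `naive_mm_accesses(4)` returns 3 · 4^3 = 192, and `naive_mm_bytes(4, 4)` returns 768. A real machine would move less data than this, since caches and registers keep some operands close to the core.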

federico-busato commented 6 months ago

Well, I see your point. On the other hand, this is a more high-level asymptotic analysis. The data movement is intended to be the amount of data accessed from main (slowest) memory. We are looking at an "ideal" system. Actual data movement really depends on the problem sizes and hardware details, such as cache sizes. I'm not aware of any scientific work that provides a rigorous bound on data movement.
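To put numbers on the two models in this thread, here is a minimal sketch that evaluates both closed forms. The helper names are hypothetical, and both formulas are simplifications: the "ideal" one assumes each matrix crosses the main-memory boundary exactly once, and the other assumes no reuse at all.

```cpp
#include <cstdint>

// Hypothetical helpers, not taken from the slides.

// "Ideal" model: each of the three n x n matrices is transferred between
// main memory and the processor exactly once.
constexpr std::uint64_t ideal_bytes(std::uint64_t n, std::uint64_t elem_bytes) {
    return 3 * n * n * elem_bytes;
}

// No-reuse model: every inner-loop iteration fetches its three operands
// from main memory.
constexpr std::uint64_t no_reuse_bytes(std::uint64_t n, std::uint64_t elem_bytes) {
    return 3 * n * n * n * elem_bytes;
}

// The two models differ by exactly a factor of n.
static_assert(no_reuse_bytes(1024, 4) / ideal_bytes(1024, 4) == 1024,
              "no-reuse traffic is n times the ideal traffic");
```

For N = 1024 and 4-byte floats, `ideal_bytes` gives 12,582,912 bytes (12 MiB) and `no_reuse_bytes` gives 12,884,901,888 bytes (12 GiB). Actual traffic on real hardware falls somewhere between the two, depending on cache size and loop order.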

xintin commented 6 months ago

I agree with your points. What I was referring to is closer to slides 40-41 of this deck, while your analysis is similar to slide 39 of the same deck.

I think the figures used in James Demmel's slides are more intuitive.

> I'm not aware of any scientific work that provides a rigorous bound on data movement.

If you are interested, there is excellent work by Hong and Kung, Irony et al., and Ballard on this topic (obviously out of scope here).
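For context, the classical lower bound usually attributed to Hong and Kung can be stated roughly as follows. This is only a sketch of the asymptotic form, assuming a two-level memory with a fast memory of M words and the standard O(N^3) algorithm. The symbols Q (words moved between fast and slow memory) and M are introduced here for illustration and do not come from the slides.

```latex
Q(N, M) = \Omega\left( \frac{N^3}{\sqrt{M}} \right)
```

Under this model, the 3 · N^2 count corresponds to a fast memory large enough to hold all three matrices, while the 3 · N^3 count corresponds to no reuse at all.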

Lastly, thanks again for this wonderful compilation. This repo is a great resource.

federico-busato commented 5 months ago

Hi @xintin, thanks again for your suggestions and the references. I'm going to update the related section. Please take a look and let me know if it is clearer now.

xintin commented 5 months ago

Looks great. Thank you @federico-busato