srush / Triton-Puzzles

Puzzles for learning Triton
Apache License 2.0

Fix typo in Puzzle 9: Simple FlashAttention #14

Closed pchng closed 3 months ago

pchng commented 3 months ago

The question references B1 when it should reference B0, since there is no B1 for this puzzle.

pchng commented 3 months ago

Also, I wonder whether B0 should be set to something like 32 instead of 200 (in the call to test() in the code cell), since the puzzle does specify B0 < T and, as I understand it, the block size is typically supposed to be a power of 2. Let me know; I can also make this change.

edit: I think this is wrong, since setting B0 to 32 causes the program_id to vary. Apologies.
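To make the program_id point concrete, here is a small sketch of the arithmetic. It assumes the launch grid is the ceiling of the input length divided by the block size, which is the usual Triton convention; `cdiv` here is a plain-Python stand-in written for illustration, not the test harness itself.

```python
def cdiv(n: int, block: int) -> int:
    """Ceiling division: how many blocks of size `block` cover `n` items."""
    return (n + block - 1) // block


# Assumed launch rule: one program instance per block of the input.
single_block = cdiv(200, 200)  # one instance, so program_id is always 0
many_blocks = cdiv(200, 32)    # several instances, so program_id varies

print(single_block, many_blocks)
```

Under that assumption, a block size of 200 keeps everything in a single program instance, while 32 splits the same input across several.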

pchng commented 3 months ago

NVM, I think I understand now. The intention (since the puzzle is "Simple") is to process the entire input in a single loop iteration, since B0 == T == 200 (which I have gotten to work).

If B0 is less than T, then the solution becomes more complicated and probably requires a tiling approach.
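For reference, a minimal sketch of what such a tiling approach could look like, in plain Python rather than Triton so it runs without a GPU. It assumes the puzzle computes a scalar attention of the form z_i = sum_j softmax(q_i * k_j)_j * v_j; the function names and the `block` parameter are hypothetical, and the running-max rescaling is the standard online-softmax trick, not necessarily the puzzle's intended solution.

```python
import math


def attention_reference(q, k, v):
    """Untiled version: softmax over all T scores at once."""
    out = []
    for qi in q:
        scores = [qi * kj for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        denom = sum(weights)
        out.append(sum(w * vj for w, vj in zip(weights, v)) / denom)
    return out


def attention_tiled(q, k, v, block):
    """Tiled version: visit k/v in chunks of `block` elements, keeping a
    running max, denominator, and numerator that are rescaled per chunk."""
    out = []
    for qi in q:
        running_max = -math.inf
        denom = 0.0
        numer = 0.0
        for start in range(0, len(k), block):
            ks = k[start:start + block]
            vs = v[start:start + block]
            scores = [qi * kj for kj in ks]
            new_max = max(running_max, max(scores))
            # Rescale earlier partial sums to the new max
            # (exp(-inf) is 0.0, so the first chunk starts from zero).
            scale = math.exp(running_max - new_max)
            weights = [math.exp(s - new_max) for s in scores]
            denom = denom * scale + sum(weights)
            numer = numer * scale + sum(w * vj for w, vj in zip(weights, vs))
            running_max = new_max
        out.append(numer / denom)
    return out


# Usage: T = 200 processed 32 elements at a time should match the untiled result.
T = 200
q = [math.sin(i) for i in range(T)]
k = [math.cos(0.5 * i) for i in range(T)]
v = [math.sin(0.3 * i + 1.0) for i in range(T)]
max_diff = max(abs(a - b) for a, b in zip(attention_reference(q, k, v),
                                          attention_tiled(q, k, v, 32)))
```

With `block` equal to T, the inner loop runs once and the tiled version reduces to the single-iteration case described above; with a smaller `block`, the rescaling step is what keeps the partial sums consistent across chunks.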