arakhmati opened 7 months ago
@arakhmati is this blocking performant Bloom?
@jvasilje It's not blocking Bloom for question answering, but it does block Bloom for causal LM, which we aren't measuring right now.
@DrJessop the error that we see in Bloom is:
```
E RuntimeError: TT_ASSERT @ tt_metal/impl/dispatch/command_queue.cpp:755: read_buffer_command_size <= DeviceCommand::HUGE_PAGE_SIZE - CQ_START
E info:
E EnqueueReadBuffer command is too large
```
To reproduce, on the `arakhmati/large-tensor` branch:

```shell
pytest "tests/ttnn/integration_tests/bloom/test_bloom_for_causal_lm.py::test_ttnn_bloom_for_causal_lm"
```
Two things:
@davorchap @jvasilje I believe 2 is much higher priority than 1, and I believe Moreh also wanted the ability to write/read into/out of a buffer at given offsets.
comment: this will be a blocker for Bloom in LLM mode, but not for the Q&A Bloom variant.
@DrJessop splitting a user buffer into multiple EnqueueReadBuffer commands under the hood -- is this the feature we need to unblock?
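For what it's worth, a minimal sketch of what that splitting could look like, assuming a hypothetical offset-based read primitive; the names (`enqueue_read_buffer`, `MAX_READ_COMMAND_SIZE`) and the size budget below are illustrative, not the actual tt-metal API or limits:

```python
# Sketch: split one large user read into several smaller commands, each
# under a fixed per-command budget. All names here are illustrative.

MAX_READ_COMMAND_SIZE = 1 << 20  # assumed per-command budget in bytes


def enqueue_read_buffer(device_buf: bytes, host_dst: bytearray,
                        offset: int, size: int) -> None:
    # Stand-in for an offset-capable device read: copy one chunk to the host.
    host_dst[offset:offset + size] = device_buf[offset:offset + size]


def read_large_buffer(device_buf: bytes, host_dst: bytearray) -> None:
    offset, total = 0, len(device_buf)
    while offset < total:
        chunk = min(MAX_READ_COMMAND_SIZE, total - offset)
        # Each enqueued command stays under the per-command limit.
        enqueue_read_buffer(device_buf, host_dst, offset, chunk)
        offset += chunk
```

Offset-based writes (the capability Moreh asked about) would be the mirror image of this loop.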
@davorchap @abhullar-tt actually supports this in her completion queue PR.
@abhullar-tt this would be great, let us know if your changes make this pass
The test runs, but the expected text is not the same as the generated text:
```
>       assert expected_generated_text == generated_text
E       assert 'Hello, my dog is cute and sweet. He loves to play with me and' == 'Hello, my dog is cute.\nong song"\n\n"?"?'
E       + Hello, my dog is cute and sweet. He loves to play with me and
E       - Hello, my dog is cute.
E       - ong song"
E       -
E       - "?"?
```
I added a standalone unit test for the large matmul (on the same branch):
pytest "tests/ttnn/unit_tests/test_matmul.py::test_matmul_with_large_n"
pytest "tests/ttnn/unit_tests/test_matmul.py::test_matmul_with_large_n"
I rebased the branch and pushed earlier; I don't think the test was included. Sorry, do you mind pushing it again?

```shell
pytest "tests/ttnn/unit_tests/test_matmul.py::test_matmul_with_large_n"
```
It's on the branch. Can you run `git pull --rebase`? And/or do a hard reset.
pytest "tests/ttnn/unit_tests/test_matmul.py::test_matmul_with_large_n"
Ah, this test fails because the page size (501760 B) is too large to fit into the dispatch core CB (440320 B).
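For reference, 501760 B is exactly one row of a bfloat16 tensor whose last dimension is Bloom's vocabulary size, i.e. the causal-LM head output. Interpreting the page as one row of a row-major buffer is an assumption on my part, not something confirmed in the thread, but the arithmetic lines up:

```python
# Assumed interpretation: in a row-major buffer where each page is one row,
# the Bloom LM-head output row in bfloat16 is exactly this large.
vocab_size = 250880          # Bloom's vocabulary size
bytes_per_element = 2        # bfloat16
page_size = vocab_size * bytes_per_element
print(page_size)             # 501760 B > 440320 B dispatch core CB
```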
@arakhmati, is there a current use case that is blocked by this?
As far as I know, this is only a problem in the ttnn implementation of the Bloom model for the CausalLM application. I don't think there are other use cases that we have encountered.
Given the limited use case, and with Bloom itself being less of a priority, I think we can downgrade from P1 to P3. Do you think differently?
Yes, we can downgrade to P3.
We are unable to move large tensors to the device. In other cases, when large tensors get created by operations, we are unable to move them to the host.
Moving a large tensor to the host with `ttl_tensor.cpu` causes...
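Until offset-based reads/writes land, one possible host-side mitigation for the host-to-device direction is to split the tensor along an outer dimension and transfer the pieces separately. This is a sketch under the assumption that the extra slicing and concat ops are acceptable, not a vetted fix:

```python
import torch
import ttnn


def to_device_in_chunks(torch_tensor, device, num_chunks=8):
    # Illustrative workaround: transfer a large tensor in slices along
    # dim 0 so each individual write stays under the per-command limit,
    # then stitch the pieces back together on the device.
    pieces = [
        ttnn.from_torch(chunk, layout=ttnn.TILE_LAYOUT, device=device)
        for chunk in torch.chunk(torch_tensor, num_chunks, dim=0)
    ]
    return ttnn.concat(pieces, dim=0)
```

The device-to-host direction would similarly need the tensor to be sliced on the device before each piece is read back.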