Open yanggthomas opened 4 months ago
I am using Taichi v1.7.1 and removed the dynamic_index parameter from ti.init().
I think the performance depends on the geometry (which affects data contiguity in memory), the configuration of your computer (software and hardware), your library versions, etc. Since these configurations are not the same, we got different results.
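Since results vary with setup, it helps to measure throughput the same way on every machine. Below is a minimal sketch of a timing helper, not the benchmark used in the paper. The names `measure_mlups`, `step`, `n_nodes`, `warmup` and `sync` are hypothetical. It assumes that the first few calls include one-off costs such as JIT compilation, and that GPU work may be launched asynchronously, so a blocking call is needed before reading the clock (with Taichi, passing `ti.sync` as `sync` should serve this purpose).

```python
import time


def measure_mlups(step, n_nodes, n_steps, warmup=10, sync=lambda: None):
    """Estimate million lattice updates per second (MLUPS) for a step function.

    step    : callable that advances the simulation by one time step (hypothetical)
    n_nodes : number of lattice nodes updated per step
    n_steps : number of timed steps
    warmup  : untimed steps, so one-off startup costs are excluded
    sync    : callable that blocks until queued device work has finished
    """
    for _ in range(warmup):
        step()
    sync()  # make sure warm-up work is done before starting the clock

    t0 = time.perf_counter()
    for _ in range(n_steps):
        step()
    sync()  # make sure all timed work is done before stopping the clock
    elapsed = time.perf_counter() - t0

    return n_nodes * n_steps / elapsed / 1e6
```

Using the same warm-up count, step count and node count on both machines makes the two numbers easier to compare.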
On Wed, 12 Jun 2024 at 11:44, yanggthomas wrote:
Hi, I am trying to verify the performance on an A100 PCIe version. However, I can't get the expected performance reported in your paper.
Currently I am getting 467 MLUPS for the cavity example with a 400^3 cubic domain and 86 MLUPS for the non-sparse two-phase case with 256^3.
Besides, the conclusion section of your paper reports that "The performance of the NVIDIA A100 GPU reached over 900 MLUPS for single-phase flow and 500 for two-phase flow with surface tension." However, in Table 1 the maximum for two-phase flow is 310 MLUPS. Is this a typo, or is there another reason?
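For readers comparing these figures: MLUPS (million lattice updates per second) is commonly computed as lattice nodes times time steps, divided by wall-clock seconds and by 10^6. Here is a minimal sketch under that assumption. The function names are hypothetical, and for a sparse run the number of active nodes would presumably replace nx*ny*nz; which convention the paper uses is not stated in this thread.

```python
def mlups(nx, ny, nz, steps, seconds):
    """MLUPS for a dense nx*ny*nz lattice advanced `steps` times in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return nx * ny * nz * steps / seconds / 1e6


def seconds_per_step(nx, ny, nz, rate_mlups):
    """Wall-clock time of one time step implied by a given MLUPS rate."""
    if rate_mlups <= 0:
        raise ValueError("rate_mlups must be positive")
    return nx * ny * nz / (rate_mlups * 1e6)


if __name__ == "__main__":
    # 400^3 cavity case: implied step time at the measured and the reported rate
    for rate in (467, 900):
        print(rate, "MLUPS ->", round(seconds_per_step(400, 400, 400, rate), 4), "s/step")
```

By this definition, 467 MLUPS on a 400^3 grid corresponds to roughly 0.137 s per time step, and 900 MLUPS to roughly 0.071 s.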
So what is your suggested best practice for achieving good performance?
Sorry, I'm afraid I'm not the best person to answer this question. I'm not a computing expert but a researcher working on CFD algorithms, more on the numerical methods side...