f4pga / f4pga-arch-defs

FOSS architecture definitions of FPGA hardware useful for doing PnR device generation.
https://f4pga.org
ISC License

"HDDMProto::writeMessage failed" from Vivado #1527

Closed tcal-x closed 4 years ago

tcal-x commented 4 years ago

This may be related to Issue #1400.

This was seen while testing PR #1515, in the xc7_vendor tests.

I see the Issue #1400 failure for test bram_sdp_test_36 for Basys3.

Later I see other failures under tests/9-soc and tests/9-scalable_proc, all for Basys3, 50T part. My changes for 100T should not be affecting them.

There is one error like this:

ERROR: [Common 17-1294] Unable to create directory [/tmpfs/src/github/symbiflow-arch-defs-presubmit-xc7-vendor/build/tests/9-scalable_proc/top_bram_n3/artix7-xc7a50t-basys3-roi-virt-xc7a50t-basys3-test/.Xil/Vivado-22251-kokoro-gcp-ubuntu-prod-1092307855/dcp1].

This may be general failure reporting, not actually a problem creating the folder.

There are multiple errors like this:

ERROR: [Common 17-49] Internal Data Exception: HDDMProto::writeMessage failed

These messages seem to be related to the following tests, although I'm not 100% sure due to the interleaving of messages from parallel builds:

9-scalable_proc/top_bram36_n3
9-scalable_proc/top_dram_n3
9-soc/murax

When I run locally on my branch staging100T, the *_vivado_diff_fasm tests are all correct, except for 9-soc/murax, which shows the #1400 issue.

litghost commented 4 years ago

"Unable to create directory"

Is it possible the disk is full?

tcal-x commented 4 years ago

"Unable to create directory"

"Is it possible the disk is full?"

I don't know how to prove or disprove that hypothesis. It happened to me twice, but I guess if the test is allocated a private, fixed-size disk each time, then it would repeat the same behavior.

In case this link is useful: https://pantheon.corp.google.com/storage/browser/symbiflow-arch-defs/artifacts/prod/foss-fpga-tools/symbiflow-arch-defs/presubmit/xc7_vendor/1028/20200614-225509

litghost commented 4 years ago

"I don't know how to prove or disprove that hypothesis. It happened to me twice, but I guess if the test is allocated a private, fixed-size disk each time, then it would repeat the same behavior."

From some quick googling: https://forums.xilinx.com/t5/Implementation/ERROR-Common-17-49-Internal-Data-Exception-HDDMProto-readMessage/td-p/788832

That thread supports the disk-full hypothesis. To verify it in kokoro, could you add a "df -h" before and after the Vivado invocations?
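
A minimal sketch of that check, assuming a POSIX-style shell and a df that accepts -h (GNU coreutils and the BSDs do). The run_with_disk_report wrapper is a hypothetical name, not something in the arch-defs scripts. It prints free space for the current directory before and after a command and passes the command's exit status through:

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): report disk usage
# for the current directory around an arbitrary command.
run_with_disk_report() {
    echo "== df before: $*"
    df -h .
    status=0
    # Run the wrapped command; capture a non-zero status without aborting
    # even if the calling script uses "set -e".
    "$@" || status=$?
    echo "== df after (exit status ${status}): $*"
    df -h .
    return "${status}"
}

# Example call; substitute the real Vivado invocation for "true".
run_with_disk_report true
```

If the "after" report shows the filesystem at or near 100% on a failing run, that would back up the disk-full explanation; if usage stays low, the cause is probably elsewhere.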

tcal-x commented 4 years ago

Thanks Keith,

Where do I access the hooks for changing kokoro behavior?

My Googling on the error message was inconclusive. It seemed like there was some internal bug, since they said "this has been resolved in 17.3". But maybe they just meant they improved their messaging to something useful like "Disk Full".

litghost commented 4 years ago

It's all done via sh scripts: https://github.com/SymbiFlow/symbiflow-arch-defs/tree/master/.github/kokoro

You could also modify what CMake invokes during the build.

tcal-x commented 4 years ago

Update: as with the last run, I no longer see the Vivado errors or the "cannot create folder" message.

df shows that I'm using less than 1% of the available space -- 28GB out of 4TB.

I'll revert the changes I made to xc7-vendor.sh. This is how the ninja line looked with my changes:

ninja -k0 -j${MAX_CORES} all_xc7_diff_fasm || ( echo TCAL ; df -h . ; echo du: ; du -d 2 ; exit 141 )
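
For anyone reusing the pattern, here is the "||" fallback from that ninja line in isolation. This is only a sketch: build_step is a hypothetical stand-in for the ninja invocation, and the du call is left out to keep it quick. The parenthesized group runs only when the step fails, and the exit inside the subshell becomes the status of the whole list:

```shell
#!/bin/sh
# Hypothetical stand-in for the failing build step; the real script runs ninja here.
build_step() { return 1; }

rc=0
# On failure, print a marker and a disk report, then force a distinctive exit code.
build_step || ( echo "build failed, disk report:" ; df -h . ; exit 141 ) || rc=$?
echo "exit status: ${rc}"
```

With the stand-in always failing, the last line prints "exit status: 141", so the diagnostic output never hides the failure from the caller.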