Closed stevefan1999 closed 2 years ago
It is also reported there is some hardware fluke on KC705.
Other than the problems due to the IP settings?
This seems to be related to #1798. The minimal reproducible code passes when built with PR #1802, but I should run more rounds of testing first.
Case resolved with #1802
Bug Report
Issue Details
This bug previously affected build #75623 and build #73252, and it appears to be repeatable under specific conditions. It led us here, which is where the unit test stopped in both Hydra executions: https://github.com/m-labs/artiq/blob/4f723e19a6bc9e84c0eb627461fe625007307504/artiq/test/coredevice/test_embedding.py#L90
Indeed, after commenting out that line of code, the test pipeline continues smoothly. But after closer inspection, this probably isn't the entire story. I have written minimal reproducible code that replicates this issue fairly consistently.
Minimal reproducible code:
```python3
import numpy
from artiq.experiment import *


class KernelRoundtripStuck(EnvExperiment):
    def build(self):
        self.setattr_device("core")

    @kernel
    def roundtrip(self, obj, fn):
        fn(obj)

    def run(self):
        lam = lambda _: None
        for i in range(1, 5):
            self.roundtrip(None, lam)
            self.roundtrip(numpy.array([1, 2, 3], dtype=numpy.int32), lam)
            self.roundtrip(numpy.array([1.0, 2.0, 3.0]), lam)
            self.roundtrip(numpy.array(["a", "b", "c"]), lam)
            self.roundtrip(numpy.array([[1, 2], [3, 4]], dtype=numpy.int32), lam)
        print("passed")
```
Expected Behavior
It should pass regardless of the input.
Actual (undesired) Behavior
It gets stuck, as shown in the log below.
UART log:
```
[ 36104.746180s] INFO(runtime::session): no connection, starting idle kernel
[ 36104.751942s] INFO(runtime::session): no idle kernel found
[ 36128.716513s] INFO(runtime::session): new connection from
```
It seems like this issue is from numpy.
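To narrow down which input actually hangs, it may help to run each case under a timeout so that a stuck call is reported instead of blocking the whole run. The following is only a host-side sketch under assumptions: `run_with_timeout` and `find_stuck_cases` are hypothetical helpers, not part of ARTIQ, and the stand-in callables below would have to be replaced by something that wraps the real `self.roundtrip(value, lam)` call.

```python
import threading
import time


def run_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread; return True if it returned within timeout_s.

    Hypothetical helper. An exception inside fn also counts as "returned".
    """
    done = threading.Event()

    def target():
        try:
            fn()
        finally:
            done.set()

    # Daemon thread, so a call that never returns cannot block interpreter exit.
    threading.Thread(target=target, daemon=True).start()
    return done.wait(timeout_s)


def find_stuck_cases(cases, timeout_s=5.0):
    """cases: list of (label, zero-argument callable). Returns labels that timed out."""
    stuck = []
    for label, fn in cases:
        if not run_with_timeout(fn, timeout_s):
            stuck.append(label)
    return stuck


if __name__ == "__main__":
    # Stand-ins for the real round trips (assumption: one of them hangs).
    cases = [
        ("None", lambda: None),
        ("string array", lambda: time.sleep(0.5)),
    ]
    print(find_stuck_cases(cases, timeout_s=0.1))
```

Note that the timeout only detects the hang; it does not recover the connection, so after the first stuck case the remaining results on real hardware are probably not meaningful without a reset.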
Your System (omit irrelevant parts)
NixOS 21.11
ARTIQ v7.0-dev
rpi-1
It has also been reported that there is a hardware fluke on KC705. Maybe we should try this on other hardware to be sure.
Notice
~Commits up to df6aeb99f6da58dc76deb30f92a32a2bcf1d6589 also succumbed to this~ It seems fine after I rebuilt the gateware; the new problematic commit is 20e079a381b777550ce7872151b17fb4a11cd7ae
7209e6f2798c70333e115d86c631a34f19b20be5 appears to be a good commit
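With one apparently good and one bad commit, the range between them could be bisected (in practice `git bisect` does this). A minimal sketch of the idea, assuming the good commit is an ancestor of the bad one; the `commits` list and the `is_bad` predicate are hypothetical, and given the note above, `is_bad` would probably need to rebuild the gateware before each check:

```python
def first_bad(commits, is_bad):
    """Return the first bad commit by binary search.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    # Placeholder labels, not real commit hashes.
    commits = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]
    bad = {"c4", "c5", "c6"}
    print(first_bad(commits, lambda c: c in bad))
```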