Closed chaufe closed 1 year ago
@chaufe : How do I run this example?
@RTimothyEdwards You could either upload the unzipped test case into https://colab.research.google.com, or directly follow this link to load the testcase from my github repo into Google Colab. https://colab.research.google.com/github/chaufe/testcases/blob/main/colab/magic___buffer_overflow_when_merging_in_GF180MCU_SRAM_via_mag_file.ipynb Use Ctrl-F9 or the menu bar to run the all cells of the test case. The third-last cell named Merge GDS (trigger buffer overflow) will contain the buffer overflow. At the left of the colab window, there are options to download files.
@chaufe : Try using the command locking off
at the top of your script. The GDS of the GF I/O cells has so many nested subcells that it exceeds the typical Linux ulimit on open file descriptors. Once that happens, random bad things occur.
I added this line in the colab and it seems to have worked.
FYI, the purpose of file locking is for the management of layout on a filesystem where a layout could potentially be edited by multiple people at the same time. Since that doesn't apply to colabs, any colab script running magic may disable file locking. The problem doesn't show up often, though, and I've only seen it occur when using the GF I/O cells (the ulimit is typically set to 1024, so you have to have a lot of cells in the layout---that's unique cells, not instances). I'm not sure why changing the source of the SRAM layout makes the problem go away, but I assume that reading the SRAM GDS is just holding the one GDS file open, while reading the .mag files of the SRAM is holding a lot of both .mag and GDS files open, which on top of the I/O cells being held open, is exceeding the ulimit.
@RTimothyEdwards: Thanks for your reply.
I don't think it is the file locking limit. I've added locking off
to the beginning of file def2gds.buffer_overflow.mag
. When running magic using this script, I still see *** buffer overflow detected ***: terminated
during gds write
.
I've also inlcuded ulimit -a
, which reports file locks (-x) unlimited
.
The failing part of the test case does not read-in any GDS directly, just .mag files via the addpath
command.
I've updated the test case to more recent builds of magic/gf180mcuC/openroad in-place:
https://colab.research.google.com/github/chaufe/testcases/blob/main/colab/magic___buffer_overflow_when_merging_in_GF180MCU_SRAM_via_mag_file.ipynb
@chaufe : How can I download the DEF file generated by the colab?
Poking around on the web, it seems that the "buffer overflow detected" error does relate to open file descriptors. I know that "ulimit -a" on all the systems I've ever used gives a useless answer. I also get "unlimited" but I know there's a hard limit of 1024 set in the kernel and I've never found that any of the standard cited methods to remove that limit actually work.
Possibly what's going on has nothing to do with file locking, but that the GDS writing routine is opening files in a nested routine that is ending up with too many open files at the same time. If so, then I should be able to reproduce that on my local machine reading the DEF file, since I know my home desktop has the ulimit issue. That would give me more flexibility for debugging it and figuring out if I can refactor the code so that it doesn't keep so many file descriptors open. Assuming that's the problem.
At the left of colab, there is a little "folder" symbol. Clicking on it will turn it orange (see picture) and reveal the file system of the colab env, enabling file download using the right mouse button.
The DEF is located here: runs/RUN*/results/placement/testcase.def
Here is copy of testcase.def.gz from my run in colab.
@chaufe : Solved. There was one place in the code that the flag set by "locking disable" was not checked, so files were not closed, causing the open file descriptor overflow. I fixed the problem in magic version 8.3.415 (on opencircuitdesign.com; github will get updated in 12 hours).
Thank you for the test case!
BTW, I recommend that you run the command cif *hier write disable
before running gds write
. In the case of a top-level assembly from standard cells and standard I/O, there should be no unexpected hierarchical interactions between cells (because all cells have been purposefully designed to abut cleanly). Disabling magic's automatic check for hierarchical interactions will greatly speed up the writing of the GDS file.
Thanks for the fix. The buffer overflow is also gone at may side, so I close the issue.
Regarding the cif *hier write disable
, I prefer not having this flag set just to be on the safe side.
(Assume a faulty stdcell design, or a stdcell library with cell spacing constraints that have not been satisfied by the placer.)
That's certainly the safe assumption. For large digital designs, though, it can add a hefty overhead to the GDS output processing.
Please find attached a test case ipython notebook that can be run in Google Colab. magic___buffer_overflow_when_merging_in_GF180MCU_SRAM_via_mag_file.ipynb.gz
In this test case, a simplistic design is implemented containing a pad ring and an SRAM macro, all based on GF180MCU.
To merge the GDS of the testcase, I have attached two version, one creating a buffer overflow inside magic, and one which doesn't.
To create the buffer overflow, no GDS file is read-in but
addpath
is being used:lef read $::env(CONDA_PREFIX)/share/pdk/gf180mcuC/libs.ref/gf180mcu_fd_sc_mcu7t5v0/techlef/gf180mcu_fd_sc_mcu7t5v0__nom.tlef
addpath $::env(CONDA_PREFIX)/share/pdk/gf180mcuC/libs.ref/gf180mcu_fd_sc_mcu7t5v0/mag/
addpath $::env(SRAM_LIB_PATH)/mag/
def read $::env(IN_DEF)
gds write $::env(IN_DEF).buffer_overflow.gds
Upon GDS write,
magic
exists with a broken output GDS file:[...]
Generating output for cell gf180mcu_fd_sc_mcu7t5v0__dffq_2
Generating output for cell testcase
*** buffer overflow detected ***: terminated
To work-around the buffer overflow, I have changed the commands to use the SRAM GDS file rather than the MAG file:
#addpath $::env(SRAM_LIB_PATH)/mag/
gds read $::env(SRAM_LIB_PATH)/gds/gf180mcu_fd_ip_sram__sram512x8m8wm1.gds
This version does not trigger a buffer overflow.
/cc @proppy /cc @dhaentz1