RTimothyEdwards / open_pdks

PDK installer for open-source EDA tools and toolchains. Distributed with setups for the SkyWater 130nm and Global Foundries 180nm open processes.
http://opencircuitdesign.com/open_pdks
Apache License 2.0

mismatch simulations no longer work #315

Open StefanSchippers opened 1 year ago

StefanSchippers commented 1 year ago

After an open_pdks update, most pm3 FET model files no longer contain the gauss() functions needed for Monte Carlo/mismatch simulation. For example, here is the expression for vth0 of an nfet_01v8 transistor (.../share/pdk/sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__nfet_01v8__tt.pm3.spice):

+ vth0 = {0.5190093+sky130_fd_pr__nfet_01v8__vth0_slope_spectre*(sky130_fd_pr__nfet_01v8__vth0_slope/sqrt(l*w*mult))}

The expected expression should look like the one in sky130_fd_pr__nfet_01v8__tt_leak.pm3:

+ vth0 = {0.536077+MC_MM_SWITCH*AGAUSS(0,1.0,1)*(sky130_fd_pr__nfet_01v8__vth0_slope/sqrt(l*w*mult))}
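
A quick way to tell the two cases apart across many model files is to search for leftover placeholders. The sketch below is a minimal check, assuming (based only on the two expressions quoted above) that an unprocessed file can be recognized by identifiers ending in `_slope_spectre`. The function name `find_leftover_params` is made up for this illustration and is not part of open_pdks.

```python
import re
import sys
from pathlib import Path

# Assumption: an identifier ending in "_slope_spectre" is a placeholder that
# should have been replaced by MC_MM_SWITCH*AGAUSS(...) during the build.
LEFTOVER = re.compile(r"\b\w+_slope_spectre\b")


def find_leftover_params(text):
    """Return (line_number, identifier) pairs for leftover placeholders."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEFTOVER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Usage: python check_mismatch.py file1.pm3.spice file2.pm3.spice ...
    for name in sys.argv[1:]:
        text = Path(name).read_text(errors="replace")
        for lineno, ident in find_leftover_params(text):
            print(f"{name}:{lineno}: {ident}")
```

A file that prints nothing has no leftover placeholders by this heuristic. That does not by itself prove the AGAUSS() substitution happened.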

RTimothyEdwards commented 1 year ago

I don't think it's anything I did---I haven't touched the scripts that handle that. Did the sky130_fd_pr repository change?

RTimothyEdwards commented 1 year ago

Okay, there have been no changes to the library so I guess I have to conclude that it is something I did after all. . . Checking now.

RTimothyEdwards commented 1 year ago

@StefanSchippers : I can't duplicate this result. I am still getting the correct behavior.

StefanSchippers commented 1 year ago

@RTimothyEdwards Thank you for checking. This is strange, since I got this twice after complete reinstalls, that is, after completely removing both the installed tree (pdk/) and the source tree (open_pdks_git/). I am now doing a complete install once again.

StefanSchippers commented 1 year ago

No way. After the n-th complete reinstall the file sky130_fd_pr__nfet_01v8__tt.pm3.spice still has no gauss function in vth0:

+ vth0 = {0.5190093+sky130_fd_pr__nfet_01v8__vth0_slope_spectre*(sky130_fd_pr__nfet_01v8__vth0_slope/sqrt(l*w*mult))}

Is there something I should check? Maybe this is (yet another) Python version incompatibility? Is there any special Python lib/package that must be present? I have Python 3.10.8. I have attached config.log for reference.

StefanSchippers commented 1 year ago

I am also attaching the file sky130/sky130A_make.log. Interestingly, it contains a message for the file sky130_fd_pr__nfet_01v8__tt_leak.pm3.spice:

...
  Line 48039: Found mismatch parameter 'sky130_fd_pr__nfet_01v8__vth0_slope_spectre' and replaced with 'MC_MM_SWITCH*AGAUSS(0,1.0,1)'.
  Line 48064: Found mismatch parameter 'sky130_fd_pr__nfet_01v8__voff_slope_spectre' and replaced with 'MC_MM_SWITCH*AGAUSS(0,1.0,1)'.
Something was replaced in 'sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__nfet_01v8__tt_leak.pm3.spice'.
...

So in the above file the AGAUSS() insertion has been done, while no similar message is shown for sky130_fd_pr__nfet_01v8__tt.pm3.spice. I really can't figure it out: the two files have identical structure, yet the sky130_fd_pr__nfet_01v8__vth0_slope_spectre pattern is not replaced in the second file. (Attached: sky130A_make.log)

StefanSchippers commented 1 year ago

I see an error in the sky130/sky130A_make.log:

Traceback (most recent call last):
  File "/mnt/sda7/open_pdks_git/sky130/./custom/scripts/mismatch_params.py", line 136, in <module>
    infile = open(infile_name, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__rf_nfet_01v8_bM04W5p00L0p18.spice'

Could this lead to a premature end of the mismatch_params.py script, leaving the following files unprocessed?

This file exists in the source tree:

./sources/sky130-pdk/libraries/sky130_fd_pr/latest/cells/rf_nfet_01v8/sky130_fd_pr__rf_nfet_01v8_bM04W5p00L0p18.spice

It's strange that there is no file in the staging area, since according to the log file the copy was done:

Install:/mnt/sda7/open_pdks_git/sources/sky130-pdk/libraries/sky130_fd_pr/latest/cells/rf_nfet_01v8/sky130_fd_pr__rf_nfet_01v8_bM04W5p00L0p18.spice to /mnt/sda7/open_pdks_git/sky130/sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__rf_nfet_01v8_bM04W5p00L0p18.spice

Is it possible to run make without parallel threads, that is, one operation at a time? Maybe this is a race condition? There are no disk space issues, since I have 75GB of free disk space after completing the build.
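
To cross-check the log against the staging area, one option is to parse the `Install:` lines and test whether each destination exists. This is only a sketch. It assumes every install record has the `Install:<source> to <destination>` shape of the single line quoted above and that paths contain no spaces. `missing_install_targets` is a hypothetical helper, not an open_pdks script.

```python
import os
import re

# Assumption: install records look like "Install:<src> to <dst>" with no
# whitespace inside either path, as in the one log line quoted in this thread.
INSTALL_LINE = re.compile(r"^Install:(?P<src>\S+) to (?P<dst>\S+)$")


def missing_install_targets(log_text, exists=os.path.exists):
    """Return destinations named in Install: lines that do not exist."""
    missing = []
    for line in log_text.splitlines():
        match = INSTALL_LINE.match(line.strip())
        if match and not exists(match.group("dst")):
            missing.append(match.group("dst"))
    return missing
```

Because this runs after the build, it can only show whether a file is absent at that point. If the problem is a timing issue, the file may well be present by the time the check runs.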

StefanSchippers commented 1 year ago

After another 'make veryclean' followed by 'make', I have another Traceback in sky130/sky130A_make.log, but this time on a different file:

Traceback (most recent call last):
  File "/mnt/sda7/open_pdks_git/sky130/./custom/scripts/mismatch_params.py", line 136, in <module>
    infile = open(infile_name, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__rf_nfet_01v8_bM04W1p65L0p15.spice'

So it really looks like some race condition.

RTimothyEdwards commented 1 year ago

You should be able to do make -j 1 (I think that is the correct syntax?) to get a single-thread build. Although I'm somewhat skeptical because this happens all within one recipe (primitive-%), where each line of the recipe should be executed sequentially. The only parallelism that the "make" process can do, that I am aware of, is to run different recipes on different threads as long as the Makefile doesn't declare that one recipe is dependent on another one.

StefanSchippers commented 1 year ago

As a follow-up: when running make a second time (without cleaning data), no Traceback errors for missing files are reported, and the subsequent make install works fine. All .pm3.spice files are correctly processed. This fixes my issues with mismatch regressions for now.

StefanSchippers commented 1 year ago

I have also run the very same procedure on another computer, and there I get two Traceback messages. First:

Traceback (most recent call last):
  File "/mnt/sda7/open_pdks_git/sky130/./custom/scripts/mismatch_params.py", line 136, in <module>
    infile = open(infile_name, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__cap_vpp_11p5x11p7_l1m1m2_shieldpom3.spice'

Second:

Traceback (most recent call last):
  File "/mnt/sda7/open_pdks_git/sky130/./custom/scripts/process_params.py", line 61, in <module>
    infile = open(infile_name, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'sky130A/libs.ref/sky130_fd_pr/spice/sky130_fd_pr__cap_vpp_11p5x11p7_m1m2m3_shieldl1.spice'

It really looks like the copy from source to stage is not complete when processing in the stage directory begins. As noted before, running make a second time, followed by make install, fixes things.
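
Since these tracebacks appear in the middle of the make log and are easy to overlook, a small scan of the log can list which script failed on which file. The sketch below assumes the tracebacks have exactly the shape quoted in this thread (a `File "...", line N, in ...` line followed later by a `FileNotFoundError: [Errno 2] ...` line). `find_missing_file_errors` is a name invented for this example.

```python
import re
from pathlib import PurePosixPath

# Patterns modelled on the tracebacks quoted in this thread.
FILE_LINE = re.compile(r'^\s*File "(?P<script>[^"]+)", line \d+, in ')
MISSING = re.compile(
    r"^FileNotFoundError: \[Errno 2\] No such file or directory: '(?P<path>[^']+)'"
)


def find_missing_file_errors(log_text):
    """Return (script_name, missing_path) pairs found in a build log."""
    errors = []
    script = None  # most recent script named in a traceback frame
    for line in log_text.splitlines():
        frame = FILE_LINE.match(line)
        if frame:
            script = PurePosixPath(frame.group("script")).name
            continue
        err = MISSING.match(line.strip())
        if err:
            errors.append((script, err.group("path")))
            script = None
    return errors
```

Running this over sky130/sky130A_make.log after each build gives a quick yes/no answer on whether any staged file was missing when a script tried to open it.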

donnie-j commented 1 year ago

Sorry, this will likely not add much of substance, except a hand up for also seeing failures...

We are also plagued by seemingly random, unexplained Python tracebacks. We have an old 'golden' machine that can build working PDKs, and minimalist-as-possible scripts (https://github.com/j-core/openlane-vhdl-build) for building clean machines, which don't end up building good PDKs. Both require 'ulimit -n 65536' (see script 04 in that repo) to get any useful output, which nonetheless seems a red herring for this issue.

The other odd thing is that both the golden machine and the others produce tracebacks, but the degree of breakage on the golden machine usually doesn't affect our usage. If the golden machine -doesn't- produce a traceback (that is, if it actually runs everything), the results look good all the way through, but magic complains that cell boundaries have changed in the final DRC of a layout containing macros built with the PDK, even though said macros pass DRC perfectly fine on their own.

Stefan's report now makes me think this is something like a dependency problem, which makes complete sense since the failing machines are new Ryzen SMP machines (bare metal) and the golden machine is an old, slow single-processor AWS instance...

We're looking at it but are good for the moment... this will only become a blocker eventually, because we will want complete and portable reproducibility.

StefanSchippers commented 1 year ago

@donnie-j This matches my observations. I started seeing problems on a Core i7 laptop after replacing the magnetic HD with a faster SSD. It looks like a missing dependency leading to a race condition. Some machines get things right, others do not. I get things right in all cases using the sequence make; make; [sudo] make install instead of the usual make; [sudo] make install.
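
For scripted builds, the two-pass sequence can be wrapped in a few lines. This is a sketch of the workaround only, not a fix for the underlying problem. `run_steps` and `WORKAROUND_STEPS` are names invented here, and the optional sudo for the install step is left out.

```python
import subprocess

# The workaround sequence from this thread: build twice, then install.
WORKAROUND_STEPS = (["make"], ["make"], ["make", "install"])


def run_steps(steps=WORKAROUND_STEPS, cwd=".", runner=subprocess.run):
    """Run each command in order; check=True raises on a non-zero exit."""
    completed = []
    for cmd in steps:
        runner(list(cmd), cwd=cwd, check=True)
        completed.append(" ".join(cmd))
    return completed
```

Note that the reports above suggest the first make can finish even though a script inside it raised a traceback, so a clean exit status alone does not show that every file was processed. Checking the log afterwards is still advisable.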