verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org
Other
979 stars 378 forks source link

add pin counting filter option to pack_mol_in_existing_cluster #2585

Closed KA7E closed 2 weeks ago

KA7E commented 3 weeks ago

Added enable_pin_feasibility_filter parameter to pack_mol_in_existing_cluster

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Types of changes

Checklist:

AlexandreSinger commented 3 weeks ago
image

Thats not a good error to see. It looks like our self-hosted machines are not having a great time. One issue for this @KA7E is that the CI is run on EVERY push AND PR raised on a branch. If you are not careful you may be running the CI way more often than you think. This appears to be causing strain on the machines. We may have to wait for the CI to calm down a bit.

I am currently working on getting having the CI get called less and save less information so this is more unlikely to happen. But for now, lets get the other PRs merged in.

AlexandreSinger commented 3 weeks ago
image

This error pretty clearly has nothing to do with your changes (since this actually ran the CI twice and one of them passed). @vaughnbetz , I wonder if this is caused by the artifacts thing we were talking about earlier this week.

Like this does not look good LOL:

image

But granted I have always seen it over, but usually its only 30 GB not 140 GB. However, reading into it this is actually from CCache not from artifacts:

image

Either way I will look into how we are storing caches and think about how we can reduce it, since it looks like we are storing 400 MB of data per cache which feels like a lot.

Regarding the artifacts thing though, this is either a fluke; or we are having issues with how much artifacts we can store. @vaughnbetz maybe we can follow the steps to reduce the storage from 90 days to 45 days.

KA7E commented 3 weeks ago

@AlexandreSinger and @vaughnbetz is there a way for this to progress despite the CI issues? This is less crucial to the legalizer than the other two, but still important.

AlexandreSinger commented 3 weeks ago

@KA7E I have merged in master, which contains my fix to the CI. I wanted to use this branch as a guinea pig, since every time you push it actually used to run the CI twice. Now with these changes it only runs the CI once. Mission success lol.

The CI is much calmer this morning (especially after I killed a bunch of duplicate runs), so hopefully this should pass without issue. The CI should become much more stable as more people merge in my recent PR.

vaughnbetz commented 2 weeks ago

Looks good. For safety, I'd like to get at least one design (e.g. neuron) with a CPU time number before and after the various related PR changes to confirm CPU time in the default flow is unaffected.
That could be done with a QoR sweep but a simpler test that would also work would just be to run one design on a machine with controlled load (e.g. your machine) and post the pack & place & total run times, along with post-route cpd and wirelength. I don't expect any changes, but it pays to be paranoid as it is very, very hard to find a degradation that slips into the master branch.