lip6 / coriolis

Coriolis VLSI EDA Tool (LIP6)
https://coriolis.lip6.fr
GNU General Public License v2.0

Coriolis overwrites .vst input files #82

Open Coloquinte opened 10 months ago

Coloquinte commented 10 months ago

It seems that the order of iteration of getTerminalNetlistInstanceOccurrences can vary between similar builds.

In my case, using Meltemi instead of Etesian changes cell order in Etesian (branches determinism_etesian and determinism_meltemi). The code exports etesian.nets and etesian.nodes files, which differ only by the ordering of the cells. Both tools should do exactly the same thing, as Meltemi is derived from Etesian.

If this is due to different memory layouts, this may be a symptom of a determinism issue.

robtaylor commented 10 months ago

That is worrying. Is the order consistent between runs?

Might be worth putting together a determinism test. Comparing output on the different platforms would also make sense.
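A minimal sketch of what such a determinism check could look like, assuming the exported files (for example etesian.nodes) can be compared line by line. The function name `compare_outputs` is hypothetical and not part of Coriolis. It distinguishes outputs that are identical from outputs that contain the same lines in a different order, which is the symptom reported above.

```python
from collections import Counter
from pathlib import Path


def compare_outputs(path_a, path_b):
    """Classify how two text outputs differ (hypothetical helper).

    Returns "identical", "reordered" (same lines, different order),
    or "different" (the set of lines itself differs).
    """
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    if lines_a == lines_b:
        return "identical"
    # Compare as multisets so that duplicated lines are still counted.
    if Counter(lines_a) == Counter(lines_b):
        return "reordered"
    return "different"
```

A line-based comparison is only an approximation for formats where one record spans several lines; in that case the records would have to be parsed before being compared.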



jpc-lip6 commented 10 months ago

If you haven't done so yet, display the ids of the various objects in your output files. The ids can be viewed as an order tag, or a time stamp, so the first thing is to check that they are identical. If so, we can rule out a loading order difference. If not, then it may be the IntrusiveMap keys or the Collection::Locator that have a determinism issue, but they have been thoroughly tested for that.

Coloquinte commented 10 months ago

Yes, the order is consistent between runs, so it may well be due to other setups done by the tools at startup time, or something I missed regarding configuration.

jpc-lip6 commented 10 months ago

If you have a very small design to test on, you can activate debug level 0, which dumps information on how the database itself is constructed: how the keys are computed and in which bucket they are put. It will generate tons of logs, but you can run diff on the logs afterwards to see at which point they start to differ. diff will be fast, even on big files, if there are few differences. If it takes too long, use head and tail to compare one slice of the log at a time. And gvimdiff is quite convenient for viewing the results.
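As an alternative to slicing with head and tail, the first point of divergence can be located by streaming both logs in parallel. This is a minimal sketch; `first_difference` is a hypothetical helper, and it assumes the two logs are plain text files that should match line for line.

```python
from itertools import zip_longest


def first_difference(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first mismatch, or None.

    Lines are read lazily, so large logs are never fully loaded in memory.
    If one file ends before the other, the missing line is reported as None.
    """
    with open(path_a) as log_a, open(path_b) as log_b:
        pairs = zip_longest(log_a, log_b)
        for number, (line_a, line_b) in enumerate(pairs, start=1):
            if line_a != line_b:
                return number, line_a, line_b
    return None
```

The returned line number gives a starting point for a closer look with diff or gvimdiff on a small window around it.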

jpc-lip6 commented 10 months ago

> That is worrying. Is the order consistent between runs? Might be worth putting together a determinism test. Comparing output on the different platforms would also make sense.

Yes. I agree. I did it in a limited way once upon a time between ScientificLinux and Debian, it was working for the limited tests I did. But being more systematic would be a boon.

Coloquinte commented 9 months ago

The origin of the bug is not non-determinism, but the fact that Coriolis overwrites its input file!! So running it multiple times produces different results.

Not too big of a deal for my experiments, but very error-prone whenever we have a Makefile-like flow: we need to clean up the vst files to get the same result twice.

I think we should have different input and output file names here.
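Until the file names are separated, a flow script could at least detect the problem by hashing its inputs before and after a step. This is a minimal sketch under stated assumptions: `modified_inputs` and `digest` are hypothetical helpers, and `step` stands for whatever callable runs the tool; nothing here is Coriolis API.

```python
import hashlib
from pathlib import Path


def digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def modified_inputs(input_paths, step):
    """Run step() and return the inputs it changed or removed.

    input_paths: files the step is only supposed to read (e.g. .vst files).
    step: a zero-argument callable that runs the tool (hypothetical).
    """
    before = {path: digest(path) for path in input_paths}
    step()
    return [
        path
        for path in input_paths
        if not Path(path).exists() or digest(path) != before[path]
    ]
```

A non-empty result means that rerunning the step would start from different inputs, which is exactly the situation that makes a Makefile-like flow unrepeatable.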