chipsalliance / firrtl-spec

The specification for the FIRRTL language
38 stars 27 forks source link

[major] Add `fd` operand to `printf` statement #213

Open SpriteOvO opened 3 months ago

SpriteOvO commented 3 months ago

This PR proposes to add a new operand fd to printf statement, which will allow us to fopen a file and write formatted strings to it in the future. (fopen and fclose will be implemented as intrinsics latter)

seldridge commented 3 months ago

This is getting pretty Verilog-specific which I don't really like. Specifically, in order for this to work, it likely also needs an fopen/fclose which returns an ID and that is then used. Otherwise, this is only usable for writing to stdout/stderr.

Is the motivation for this to write to specific files or to select between stdout/stderr?

If the motivation is to write to different files, then I'd prefer a Chisel/post-processing solution:

  1. Chisel provides common infrastructure for logging. E.g., log-levels, common formats, inclusion of %m to indicate what module something was in.
  2. There is a separate tool that can filter the raw log to extract specific information.

The end result of this is the same as writing to different files. However, it avoids having to introduce a real notion of a file descriptor in FIRRTL.

SpriteOvO commented 3 months ago

Is the motivation for this to write to specific files or to select between stdout/stderr?

The fundamental motivation for this is to enable the SystemVerilog generated by Chisel&CIRCT can be fwrite to different files by the simulator. The further goal is to allow Chisel to automatically split files by layer.

Specifically, in order for this to work, it likely also needs an fopen/fclose which returns an ID and that is then used. Otherwise, this is only usable for writing to stdout/stderr.

Sure, I'm working on adding fopen and fclose as FIRRTL intrinsics in CIRCT, and I should have a PR open in CIRCT repo in the next few days.

If the motivation is to write to different files, then I'd prefer a Chisel/post-processing solution:

  1. Chisel provides common infrastructure for logging. E.g., log-levels, common formats, inclusion of %m to indicate what module something was in.

  2. There is a separate tool that can filter the raw log to extract specific information.

The end result of this is the same as writing to different files. However, it avoids having to introduce a real notion of a file descriptor in FIRRTL.

If I understand correctly, printf/fwrite will be handled by the simulator, while Chisel post-processing can only output on elaboration-time anyway.

sequencer commented 3 months ago

The reason of post processing is hard is we also want write a giant log into file while keeping stdout clean for other usages e.g. uart.

seldridge commented 3 months ago

The reason of post processing is hard is we also want write a giant log into file while keeping stdout clean for other usages e.g. uart.

Thinking about this... Would it be acceptable to pipe all of the output (stdout, stderr, or both) into a script which then does the splitting for you, while leaving some things on stdout/stderr? Without writing the script:

$ ./run-verilator | slice-outputs
UART
UART
UART
$ ls
info.log warn.log debug.log layer-1.log layer2.log
sequencer commented 3 months ago

Thinking about this... Would it be acceptable to pipe all of the output (stdout, stderr, or both) into a script which then does the splitting for you, while leaving some things on stdout/stderr? Without writing the script:

This is doable, but for a user experience, this is a bad design that each user need to writing their own logging script...

seldridge commented 3 months ago

If we did do this, then it's reasonable to both build the logging library in Chisel and provide the script for working with this in Chisel as well. 🤓

sequencer commented 3 months ago

If we did do this, then it's reasonable to both build the logging library in Chisel and provide the script for working with this in Chisel as well. 🤓

I do agree, after getting fd into Chisel, I think Chisel can provide a slim package chisel3.logging to provide a logging functionality:) It should be able to log to machine readable Json like where we did in T1

Since Chisel doesn't support UVM-based Scoreboard functionality, the logging package can be used as a utility to do the offline checking. e.g. Trace event of each core, doing check after simulation is done(not at simulation runtime or UVM check_phase).

seldridge commented 3 months ago

Err... I'm saying that I'm not convinced of the need for the file descriptor and think the logging can be packaged up with a library in Chisel and a script to post-process it.

Alternatively, a file descriptor is something that is circuit-global and is said to be open for the duration of the simulation/evaluation of a circuit.

WDYT?

@darthscsi: WDYT?

sequencer commented 3 months ago

I think the situation is the stdout is not only be used by chisel, but also be used by other programs on the supply chain, e.g. UVM, C libraries. We cannot force them to align with a same pattern of logging. While delivering RTL as an IP, if customers want integrate the RTL to their own verification environment, delivery of a program to process the stdout is also an anti-pattern methodology.

seldridge commented 3 months ago

Going off of what I was suggesting above... Something like the following may make more sense:

circuit Foo:
  file Bar, "Bar.log", read
  public module Foo:
    input clock: Clock
    input a: UInt<1>

    printf(clock, UInt<1>(1), Bar, "hello world")

Alternatively, just putting the filename in the printf may be fine and it's up to the lowering to figure out what file descriptors to create:

printf(clock, UInt<1>(1), file="Bar.log", "hello world")

Basically, what is the right way to get a printf to a specific file? Can we avoid the need for low-level file descriptors?

sequencer commented 3 months ago
circuit Foo:
  file Bar, "Bar.log", read
  public module Foo:
    input clock: Clock
    input a: UInt<1>

    printf(clock, UInt<1>(1), Bar, "hello world")

I thought about it, maybe the file handler should for each public module or instance?

Basically, what is the right way to get a printf to a specific file? Can we avoid the need for low-level file descriptors?

I think this could be a good solution, making it a firtool emit option. This can be minimal overhead and being filesystem agnostic, however the problem is emission to a same file may be a problem for the simulator threading. Maybe we can have three options:

WDYT?

uenoku commented 3 months ago

There is a limitation for the number of the files that a single process can open (ulimit -n) so file per instance approach may not work practically. So it may be possible that users want to close files during the simulation. That said I agree that such resource management is super low-level and it's reasonable to use circuit global file descriptors and not to allow manual fclose.

darthscsi commented 3 months ago

I'm actually quite amenable to this provided: 1) FD can be generated in the spec. You can't have to go to intrinsics to open/close an FD. 2) all ops which produce messages can use FDs. Particularly, the weirdness with messages on asserts means they should be able to be directed to FDs. I'd rather that capability be removed from those ops (write the printfs you want for the asserts yourself), but if we aren't doing that, we should be consistent. 3) A sane Chisel-level interface for this exists. 4) FDs should be a type not usable in most ops (no addition, not on ports, etc). 5) They are explicit about their interaction with Verilog FDs, C FDs, and platform FDs (e.g. they should not be a new thing from those, just a representation in FIRRTL of the underlying FDs)

Reasoning: This is a very common piece of functionality for a reason and requested often for those reasons. Post-processing STDOUT is a hack which introduces tool-dependent filtering, imprecision, and unnecessary indirection and overhead. FIRRTL is a fairly low-level IR anyway, so I don't buy the "fds are too low level". The better argument is "fds are too simulationy". This capability matches every other language's mechanism. Interoperability with C is highly desirous, especially as DPI is rolled out. FD as an abstraction is platform agnostic (due mostly to posix and C), even on windows this is fine.

Concerns: Execution order, having a way to have global file handles (not reopening in each module, potential buffering issues there), Mixing FD with other data-types, how to get file handles to all the places which need them without passing them in ports.