Create a C++ replica of the Python configuration parser

ngober commented 4 hours ago

The Python-based configuration step has a lot of advantages. It parses very easily, applies defaults in an intuitive way, and it is importable as a module. Its primary disadvantage is that it is a separate step that generates C++ files and compiles multiple executables for trivial changes.

I would like a C++ version that we could compile and link in, to support runtime configuration.

As test cases, the new configuration parser should be able to parse all of the files in test/config/compile-only/ to the same configuration that the Python parser does. More test cases may need to be added, but these are the ones in the documentation, so they are the expected behavior from the users.

ChampSim already leverages the nlohmann-json library for stats printing, so no new dependencies need to be introduced.

maccoymerrell commented 3 hours ago

I both like and dislike this idea. I like the ability to continue using our current configuration files, but I think we should aim higher for what can be done at runtime than just replicating our current script. As a stepping stone I think this is okay, but I also think that the way configuration works currently is too strict in terms of parameter passing.

The more I have worked with ChampSim, the less and less I like the constructors for every component. They are going to be particularly annoying for moving some things into C++. It would be much better to enforce that components have no constructor, and instead must build themselves out upon an initialize() call, or something like that. For runtime construction, this would make our lives a lot easier. We then just need to parse the config file into something that makes parameter lookup easy for each component. I mentioned something like this before in #424.

Runtime construction then wouldn't need to worry about constructing each component, since we can create them without passing any parameters. Connections, references, etc... can occur after everything is instantiated rather than during that process. Then, we can invoke the build process for each component after they are all in existence.

Another potential change that would make this easier is explicit channel declarations and connections in the config file, and removing all implied ones. If we didn't have to worry about inferring connections, processing the config files becomes a lot easier. This is a pretty big change, and probably ruins backwards compatibility, but it does allow us to work towards defining certain parameters for channels such as bandwidth and determining internal queue sizes for channels (if any). Our rq_size and wq_size options in the config are really misnamed, and are probably not doing what people think they are doing. It's not clear because the channels are hidden from the config interface. You could also argue that channels themselves are probably larger objects than what we currently have defined them as. A bundle of queues going in each direction rather than what we have now, but that is probably a different discussion about how we currently use channels and queuing.

ngober commented 3 hours ago

To begin with agreement, I think you're entirely on the right track regarding the queue size options. They're not named well and some work ought to be done to fix that, but I'd like to place it outside the scope of this issue. We can open another one to address that if we want.

I think you misunderstand how the construction of components happens in ChampSim. The environment's constructor does the following:

Instantiate the necessary channels
Instantiate builders, passing the channel endpoints to the builders
Instantiate the components via the builders

This is just what you're suggesting. I strongly suggest we continue to pass information through the constructors. It is the C++ language's mechanism of guaranteeing that every object (here, component) that exists is in a valid state. If we rely on that, we eliminate the possibility of a whole class of "uninitialized value" bugs.

The difference between what I am proposing in this issue is that the unspecified 0th step, currently "generate a C++ file that defines an environment" be converted to "parse the file; pass the parsed configuration to an environment constructor".

I want to settle here for the limited objective of feature parity. If we want to extend or modify the configuration, that should be a separate issue.

ChampSim / ChampSim

Create a C++ replica of the Python configuration parser #549