Open mithro opened 4 years ago
Would you expect each backend class to have reporting methods, or would they be colocated in an `edalize.reporting` module?
Would you expect to grab every table in a report, or just focus on a few key tables?
My main use case has been to grab reports from multiple runs to do some comparison and analysis, so my goal has been to get the data into a pandas `DataFrame`. Is that useful to others, or too heavyweight? Since fusesoc already uses pyparsing, that could be another approach.
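To illustrate the multi-run comparison use case, here is a minimal sketch of what that might look like. The run names and resource columns are made up for illustration; they are not from any actual report.

```python
import pandas as pd

# Hypothetical per-run resource summaries (names and numbers are
# illustrative, not taken from a real report).
runs = {
    "baseline": {"lut": 1009, "reg": 920, "bram": 4},
    "optimized": {"lut": 870, "reg": 902, "bram": 4},
}

# One row per run makes it easy to diff, sort, or plot runs.
df = pd.DataFrame.from_dict(runs, orient="index")
print(df)
print(df.loc["optimized"] - df.loc["baseline"])  # resource delta between runs
```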
Tables that have been picked out of the report, like in your example, should be manageable with `pandas.read_table`. However, this style of table seems to give it trouble. For example, I don't think `read_table` handles leading and trailing separators well, so those show up as `NaN` columns that then need to be dropped. For a Vivado "Slice Logic" table:
```
>>> print(table_str)
+----------------------------+------+-------+-----------+-------+
| Site Type                  | Used | Fixed | Available | Util% |
+----------------------------+------+-------+-----------+-------+
| Slice LUTs                 | 1009 |     0 |    133800 |  0.75 |
| LUT as Logic               |  954 |     0 |    133800 |  0.71 |
| LUT as Memory              |   55 |     0 |     46200 |  0.12 |
| LUT as Distributed RAM     |   48 |     0 |           |       |
| LUT as Shift Register      |    7 |     0 |           |       |
| Slice Registers            |  920 |     0 |    267600 |  0.34 |
| Register as Flip Flop      |  920 |     0 |    267600 |  0.34 |
| Register as Latch          |    0 |     0 |    267600 |  0.00 |
| F7 Muxes                   |    0 |     0 |     66900 |  0.00 |
| F8 Muxes                   |    0 |     0 |     33450 |  0.00 |
+----------------------------+------+-------+-----------+-------+
>>> df = pd.read_table(io.StringIO(table_str), delimiter="|", comment="+", index_col=1).dropna(axis="columns")
>>> print(df)
                         Used  Fixed  Available  Util%
Site Type
Slice LUTs               1009      0     133800   0.75
LUT as Logic              954      0     133800   0.71
LUT as Memory              55      0      46200   0.12
LUT as Distributed RAM     48      0
LUT as Shift Register       7      0
Slice Registers           920      0     267600   0.34
Register as Flip Flop     920      0     267600   0.34
Register as Latch           0      0     267600   0.00
F7 Muxes                    0      0      66900   0.00
F8 Muxes                    0      0      33450   0.00
>>>
```
I hadn't seen asciitable or its successor `astropy.io.ascii`, but they may be more robust.
@GCHQDeveloper560 - I actually think both make sense.
We should support some standard + common tables. We should also support extracting as much vendor specific information as we can.
It might be nice to have a common device-neutral table with some basic information, but is it going to be too painful to map device-specific resources to a common format?
For example, "LUTs", "Registers", "Block Memories", and "DSPs" might make sense, but then the Quartus code would need to know a lot about which table cells map to each bucket for particular devices. For example, it would need to know that for the Cyclone 4, "Block Memories" = M9Ks and "DSPs" = DSP 9x9 + DSP 18x18. Is trying to sort resources into buckets like this a bad idea, and should this common table just include the device-specific resource names?
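As a concrete sketch of what that per-family mapping could look like (the family key, resource names, and bucket assignments below are assumptions for illustration, not a proposed final scheme):

```python
# Hypothetical mapping from vendor-specific resource names to common
# buckets, keyed by device family. All names here are illustrative.
BUCKET_MAP = {
    "cyclone4": {
        "Total logic elements": "LUTs",
        "Total registers": "Registers",
        "Total memory bits": "Block Memories",   # M9K-backed
        "Embedded Multiplier 9-bit elements": "DSPs",
    },
}

def summarize(family, resources):
    """Fold device-specific resource counts into common buckets."""
    summary = {}
    for name, count in resources.items():
        bucket = BUCKET_MAP.get(family, {}).get(name)
        if bucket is not None:
            summary[bucket] = summary.get(bucket, 0) + count
    return summary

print(summarize("cyclone4", {"Total logic elements": 1200,
                             "Total registers": 800}))
# -> {'LUTs': 1200, 'Registers': 800}
```

The downside is visible even in this toy version: every new family needs its own entry, and anything not in the map silently disappears from the summary.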
Timing information may be easier as something like "Clock Constraint (MHz)" and "FMax (MHz)" should be universal, at least for single clock designs.
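For a single clock the arithmetic is straightforward: with a constraint period P in ns and a worst slack WNS in ns from the timing report, the achievable FMax is 1000 / (P − WNS) MHz. A tiny sketch (the function name is made up):

```python
def fmax_mhz(period_ns, wns_ns):
    """Achievable FMax in MHz from the clock period and worst slack.

    Positive slack means the design can run faster than the constraint;
    negative slack pulls the achievable frequency below it.
    """
    return 1000.0 / (period_ns - wns_ns)

# 100 MHz constraint (10 ns period) with 2 ns of positive slack -> 125 MHz.
print(round(fmax_mhz(10.0, 2.0), 1))  # -> 125.0
```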
@GCHQDeveloper560 - Just starting with listing the number and usage of the devices inside an FPGA (and not trying to map to specific groups) seems like a good start?
There's an initial shot at reporting resource and timing information for ISE, Vivado, and Quartus, at least for some devices, at my fork.
In addition to the detailed device-specific information from `report_resources` and `report_timing`, I took a shot at a common summary with `report_summary`. All three are combined in the `report` method. The general approach was to grab the ASCII tables from the reports and convert them to pandas DataFrames to allow easy querying and manipulation of the results. Hopefully this setup will support adding more information in the future through better parsing of the reports: for example, reporting worst paths from a timing report, or doing something sensible with cores that have multiple clocks or other timing constraints.
Given olofk/fusesoc#337 I made no effort to support Python 2.
Thanks for any feedback!
@GCHQDeveloper560 - I haven't looked at your code yet but wanted to mention that @kgugala did some similar work here in https://github.com/SymbiFlow/fpga-tool-perf
@kgugala Discovered this really cool python module called "asciitable" @ http://docs.astropy.org/en/latest/io/ascii/index.html which is super helpful for parsing the tables out of Vivado output.
@kgugala Can you work with @GCHQDeveloper560 to get them upstream?
Hi,
Just wanted to chime in and say that this is something I would love to see. Since it's not a critical feature, we can just start pulling things in when they seem usable and iterate on the design.
@mithro, @kgugala:
I appreciated your original pointer to SymbiFlow/fpga-tool-perf. It's where I'd gotten the "extract all the tables" approach versus my previous report parsing that had just gone in and picked out a few specific bits of information.
I had hoped to just be able to feed the raw ASCII tables to pandas or asciitable/`astropy.io.ascii` without having to do a bunch of cleanup. The astropy `fixed_width` format seemed to do a better job than `pandas.read_csv` and friends with the leading and trailing separators that many of the tables use. However, like pandas, it didn't seem to handle a case from ISE where the table used a multi-line header. Another complication was tables without a header row, which both pandas and asciitable seemed unable to detect automatically.
So, the code currently has its own CSV conversion routine that works for most cases, but I'm certainly open to replacing it with something better.
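For reference, the core idea of that kind of conversion can be sketched in a few lines. This is a simplified illustration, not the routine in the branch; it just drops the `+---+` border lines and splits on the pipes.

```python
def ascii_table_to_rows(table_str):
    """Convert a +---+ bordered ASCII table into a list of cell rows.

    Simplified sketch: skips border and blank lines, splits on '|', and
    drops the empty fields the leading/trailing pipes create.
    """
    rows = []
    for line in table_str.splitlines():
        line = line.strip()
        if not line or line.startswith("+"):
            continue  # skip blank lines and +---+ borders
        cells = [c.strip() for c in line.split("|")]
        rows.append(cells[1:-1])  # drop fields outside the outer pipes
    return rows

table = """
+-----------+------+
| Site Type | Used |
+-----------+------+
| LUTs      | 1009 |
+-----------+------+
"""
print(ascii_table_to_rows(table))
# -> [['Site Type', 'Used'], ['LUTs', '1009']]
```

It is exactly the cases mentioned above (multi-line headers, headerless tables) that a routine this naive gets wrong, which is why the real one has more logic.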
I certainly welcome additional tool support or other improvements!
@olofk: I agree this is likely to take multiple iterations to get right. Pulling it in as a beta feature without a backward compatibility guarantee sounds like the way to go.
I've updated my branch to add a more complex test case with multiple clocks and fixes for the bugs and needed cleanups that it pointed out. I'd appreciate any feedback.
I don't have any other planned updates, so if this looks reasonable I'll submit a PR with the caveats from the earlier discussion that wider testing will almost certainly reveal more changes needed.
I've added some Python 3.5 clean-ups to the branch.
I also realised I need to account for new packages in `setup.py` and `tox.ini`. Unless I've missed one, I think PyParsing, NumPy, and Pandas are the additions. FuseSoC already uses PyParsing, so adding it to `install_requires` is probably not a big deal. However, I'm guessing Pandas and friends might be more of a burden for those not using the reporting features.
Is adding these packages to `install_requires` okay, or would we rather make reporting an extra and put them in `extras_require`?
Thanks for your work on this. Can you open a pull request once you have code ready to be reviewed? It's significantly easier to see and discuss there than in a branch.
On the dependencies: numpy and pandas are very heavy dependencies, and I'd like to understand better why they are necessary, and how we can potentially avoid that. That'll probably become more obvious when looking at the code in question, so a PR is appreciated.
Here is a simple example for Vivado.
https://github.com/SymbiFlow/fpga-tool-perf/blob/f8438d0447ae808fd1434073b037fa30b274c423/fpgaperf.py#L444-L506