cms-L1TK / project_generation_scripts

Python scripts to generate the wiring map of the tracklet pattern recognition & the top-level HDL that calls the HLS modules in the Hybrid Chain.
1 stars 2 forks source link

Test regions of modules #15

Closed fatimayousuf closed 3 years ago

fatimayousuf commented 3 years ago

Right now, we can run unit tests on a single module (i.e test the top level code for PR_L3PHIC). This PR gives the option to run tests on groups of modules (i.e groups of Projection Routers or groups of Match Calculators) as well as the modules upstream or downstream in the data flow.

tomalin commented 3 years ago

Nice option. Please ensure your new "--mut" option is documented in README.md

fatimayousuf commented 3 years ago

in generator_hdl.py is there any feedback to the user if the get_all_module_units returns None? It looks like it will just fail silently unless the verbosity is turned up.

Well, there isn't any similar type of feedback given to the --uut argument either, so I didn't think it was needed. Since the --mut argument only accepts certain arguments (i.e PR, VMR), get_all_module_units shouldn't return None unless something was wrong with the wires.dat file.

pwittich commented 3 years ago

in generator_hdl.py is there any feedback to the user if the get_all_module_units returns None? It looks like it will just fail silently unless the verbosity is turned up.

Well, there isn't any similar type of feedback given to the --uut argument either, so I didn't think it was needed. Since the --mut argument only accepts certain arguments (i.e PR, VMR), get_all_module_units shouldn't return None unless something was wrong with the wires.dat file.

Maybe the right thing is to add some feedback in both cases if the arguments fail (rather than just copying the other behavior.) A simple printout would probably be worth a lot.

pwittich commented 3 years ago

This would be a great option, but I'm worried that as the code is set up now, the top-level vhdl generate will be incorrect in some cases. For example, PR_L3PHIC is connected to 8 MEs (ME_L3PHIC 17->24). If we try to make all PR-ME chains by using the --mut == ME option and setting -u 1, -d 0, the scripts will generate 8 copies of the each of VMProj_L3PHIC 17-24, instead of one of each copy. Also, these VMProj memories will all be flagged as initial memories, because for some of the chains, these memories will need to be populated by the test bench, even though when they're all taken together, none of them need to be populated by the test bench. I'm not sure how to resolve this - a good first step might be at the end of get_all_module_units, to get rid of any duplicated memories. We would then have to somehow loop over the structure of the chains and figure out how to populate the is_initial and is_final members of the memory class, so that the scripts know whether those corresponding memories need to be populated by the test bench, or are internal to the top-level. Making this error-proof might be quite challenging. On the other hand, for some chains this will work fine out of the box. For example, since the PR-ME-MC chains don't overlap with each other, and there is a 1-1 correspondence between PRs and MCs, generating all PRMEMC chains should work fine with this. So I guess how we move forward depends on how soon we want this feature in the master, but I'd be reluctant to merge without at least making it clear that users have to be careful using this feature, and that it's not guaranteed to produce a consistent top-level.

This is a valid point and probably suggests we should print out a big caveat to the code when it is run in this mode. However, as I understand it, this will allow work to be done in our other goal (many chains in parallel) so I think it should go into master with the caveats mentioned above. As it is the likelihood than a non-expert will run this code in the near future is close to nil.

djcranshaw commented 3 years ago

This would be a great option, but I'm worried that as the code is set up now, the top-level vhdl generate will be incorrect in some cases. For example, PR_L3PHIC is connected to 8 MEs (ME_L3PHIC 17->24). If we try to make all PR-ME chains by using the --mut == ME option and setting -u 1, -d 0, the scripts will generate 8 copies of the each of VMProj_L3PHIC 17-24, instead of one of each copy. Also, these VMProj memories will all be flagged as initial memories, because for some of the chains, these memories will need to be populated by the test bench, even though when they're all taken together, none of them need to be populated by the test bench. I'm not sure how to resolve this - a good first step might be at the end of get_all_module_units, to get rid of any duplicated memories. We would then have to somehow loop over the structure of the chains and figure out how to populate the is_initial and is_final members of the memory class, so that the scripts know whether those corresponding memories need to be populated by the test bench, or are internal to the top-level. Making this error-proof might be quite challenging. On the other hand, for some chains this will work fine out of the box. For example, since the PR-ME-MC chains don't overlap with each other, and there is a 1-1 correspondence between PRs and MCs, generating all PRMEMC chains should work fine with this. So I guess how we move forward depends on how soon we want this feature in the master, but I'd be reluctant to merge without at least making it clear that users have to be careful using this feature, and that it's not guaranteed to produce a consistent top-level.

This is a valid point and probably suggests we should print out a big caveat to the code when it is run in this mode. However, as I understand it, this will allow work to be done in our other goal (many chains in parallel) so I think it should go into master with the caveats mentioned above. As it is the likelihood than a non-expert will run this code in the near future is close to nil.

Sure, I'm fine with that. I think we should print out some sort of warning when this option is used to remind ourselves that this can happen, but once that's done and the few small changes requested from the other comments are done I'm happy for this to go into master.

tomalin commented 3 years ago

This branch doesn't work with the HLS master, making it tricky to test. Please use "git merge" to update it with recent changes from the project_generation_scripts master. It also gives errors when run with the old "git checkout d49e68f5bc0d37e3d8ec9e231927f73060af21d2" version of the HLS code, so I'm unsure which version of HLS it's targetting? Some of the error messages are surprising, such as "Disk MatchEngine is not supported yet!", when I asked it to generate a barrel chain with "./generator_hdl.py --uut PR_L3PHIC -u 0 -d 2".

fatimayousuf commented 3 years ago

This branch doesn't work with the HLS master, making it tricky to test. Please use "git merge" to update it with recent changes from the project_generation_scripts master. It also gives errors when run with the old "git checkout d49e68f5bc0d37e3d8ec9e231927f73060af21d2" version of the HLS code, so I'm unsure which version of HLS it's targetting? Some of the error messages are surprising, such as "Disk MatchEngine is not supported yet!", when I asked it to generate a barrel chain with "./generator_hdl.py --uut PR_L3PHIC -u 0 -d 2".

This branch should already be updated with master. The command that I mainly have been using to test this is ./generator_hdl.py -p processingmodules.dat -m memorymodules.dat -w wires.dat --mut PR -u 0 -d 2. This would look at the projection routers, VMProjections, and MatchEngines in all regions (barrel+disk). I thought those errors arose from existing problems in the code. For example, I assumed that "Disk MatchEngine is not supported yet!" came up because people were still working on the MatchEngine code. I can look into this a bit more, though.

tomalin commented 3 years ago

The message "not guaranteed to produce a consistent top-level" suggests that results from this option can't be trusted. Can we be more specific about when it fails? If I run code with "./generator_hdl.py --mut ME -u 1 -d 0" option that Peter mentioned, the SectorProcessor.vhd written instantiates 288 VMPROJ memories, which agrees with the number that memorymodules.dat says there should be. However, some intermediate signals such as vmproj_l3phib16_datarray_data_v_dout are incorrectly sent over the SectorProcessor.vhd interface. This error doesn't happen if "./generator_hdl.py --mut PR -u 0 -d 1" is used.

djcranshaw commented 3 years ago

The message "not guaranteed to produce a consistent top-level" suggests that results from this option can't be trusted. Can we be more specific about when it fails? If I run code with "./generator_hdl.py --mut ME -u 1 -d 0" option that Peter mentioned, the SectorProcessor.vhd written instantiates 288 VMPROJ memories, which agrees with the number that memorymodules.dat says there should be. However, some intermediate signals such as vmproj_l3phib16_datarray_data_v_dout are incorrectly sent over the SectorProcessor.vhd interface. This error doesn't happen if "./generator_hdl.py --mut PR -u 0 -d 1" is used.

The main situation where the --mut option fails is when, in the selected chain, the modules 'fan-in' when travelling either upstream or downstream starting from the chosen central module. Fanning-out is fine. So how about a message like "WARNING: When using --mut option, if there is any single module downstream of <--mut option> in the specified chain whose inputs are written by multiple instances of <--mut option> modules, or if there is any single module upstream of <--mut option> in the specified chain whose outputs are read by multiple instance of <--mut option> modules, then this option will not produce a consistent top-level VHDL module." This I think is precise, if a bit wordy. Does this sound reasonable?

tomalin commented 3 years ago

Please remove warning "Disk MatchEngine not supported", since (as discussed in Skype), no longer needed.

tomalin commented 3 years ago

Not producing a "consistent top-level" is a bug, so we should fix it, rather than printing a warning. I suspect adding the following lines at https://github.com/cms-L1TK/project_generation_scripts/blob/top_level_option/generator_hdl.py#L377 would do it:

Correct mem.is_initial & mem.is_final, as loop over mutModules overwrites them incorrectly.

for mem in memory_list: if mem.is_initial: for proc in mem.upstreams: for p in process_list: if proc.inst == p.inst: mem.is_initial = False if mem.is_final: for proc in mem.downstreams: for p in process_list: if proc.inst == p.inst: mem.is_final = False

djcranshaw commented 3 years ago

Not producing a "consistent top-level" is a bug, so we should fix it, rather than printing a warning. I suspect adding the following lines at https://github.com/cms-L1TK/project_generation_scripts/blob/top_level_option/generator_hdl.py#L377 would do it:

    # Correct mem.is_initial & mem.is_final, as loop over mutModules overwrites them incorrectly. 
    for mem in memory_list:
        if mem.is_initial:
            for proc in mem.upstreams:
                for p in process_list:
                    if proc.inst == p.inst:
                        mem.is_initial = False
        if mem.is_final:
            for proc in mem.downstreams:
                for p in process_list:
                    if proc.inst == p.inst:
                        mem.is_final = False

I don't know if this solves all possible sources of inconsistency with using the --mut option but it does appear to solve the ones I was describing, so I'm in favour of adding these lines. I would prefer though if we could break out of the inner most two layers of the loop once we change mem.is_inital or mem.is_final to save time, and especially when running this option the number of modules could be quite large and the time savings could be significant.

rzouCERN commented 3 years ago

Not producing a "consistent top-level" is a bug, so we should fix it, rather than printing a warning. I suspect adding the following lines at https://github.com/cms-L1TK/project_generation_scripts/blob/top_level_option/generator_hdl.py#L377 would do it:

    # Correct mem.is_initial & mem.is_final, as loop over mutModules overwrites them incorrectly. 
    for mem in memory_list:
        if mem.is_initial:
            for proc in mem.upstreams:
                for p in process_list:
                    if proc.inst == p.inst:
                        mem.is_initial = False
        if mem.is_final:
            for proc in mem.downstreams:
                for p in process_list:
                    if proc.inst == p.inst:
                        mem.is_final = False

As pointed already by Peter the goal of this initial attempt at adding --mut function is so that we can proceed to generate a top level with a lot of PRMEMC chains. We can make the warning to say explicitly this only has been worked out for PRMEMC chain. We didn't mean to aim to cover all cases with this first commit. Since the master script doesn't even work for some of the other chains that this script will have issue with, I would prefer that we create an issue regarding this flag and move the discussion on how to solve this there. And move forward with this initial commit.