Open DanP114 opened 5 months ago
There's something odd. Your spec appears to be creating a 200x200 array but the stats.txt reports 16x16 instances at all inner levels of the hierarchy. Are you sure the stat dump is from this arch?
Overall a 200x200 array is hard to fill spatially. Most mappings will be underutilized, so I suspect the mapper search is just giving up too quickly. Try tweaking the hyperparameters to make it try harder. Also, in your innermost buffer constraints you should add a min parallelism constraint (e.g., 0.5). This will early-reject any mappings that don't have at least 50% utilization. You won't prevent the search heuristic from visiting such mappings, but you will elide the expensive evaluation cost for these mappings.
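To make the utilization arithmetic concrete, here is a minimal Python sketch of what a 50% min parallelism check amounts to. It is not Timeloop code and does not use Timeloop's constraint syntax: `SpatialMapping`, `utilization`, and `passes_min_parallelism` are hypothetical names for illustration, and it assumes utilization is simply the fraction of the 200x200 PEs covered by the spatial unroll factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialMapping:
    """Hypothetical stand-in for a candidate mapping: spatial unroll factors per axis."""
    x_factor: int
    y_factor: int


def utilization(m: SpatialMapping, array_x: int = 200, array_y: int = 200) -> float:
    """Fraction of the PE array that the mapping's spatial factors occupy (assumed definition)."""
    return (m.x_factor * m.y_factor) / (array_x * array_y)


def passes_min_parallelism(m: SpatialMapping, min_parallelism: float = 0.5) -> bool:
    """Cheap check that can run before any expensive evaluation of the mapping."""
    return utilization(m) >= min_parallelism


# Candidates are still visited by the search; the check only skips their evaluation.
candidates = [
    SpatialMapping(1, 200),    # unrolled along Y only
    SpatialMapping(16, 16),
    SpatialMapping(128, 160),
    SpatialMapping(200, 200),  # fully populated array
]
survivors = [m for m in candidates if passes_min_parallelism(m)]

for m in candidates:
    status = "evaluate" if m in survivors else "reject early"
    print(f"{m.x_factor}x{m.y_factor}: utilization={utilization(m):.4f} -> {status}")
```

Under that assumption, a mapping that unrolls only along Y uses 200 of the 40,000 PEs, far below the 0.5 threshold, so it would be rejected before evaluation.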
Hello,
I have a question about how to add a min parallelism constraint (e.g., 0.5) in my innermost buffer constraints. Can you give me an example?
Thanks.
Hello there,
I am currently designing a 200x200 PE convolution accelerator. I started from the base template in the provided exercise and read through some of the documentation, but the mappings I get back reach only about 1-2% utilization.
Here are my input architecture files, parsed_input, generated map, and statistics showing utilization.
My inner PE spatial loop bounds seem to unroll only along the Y-axis, with nothing along the X-axis. I believe the issue comes from the constraints definition, but my intuition is also that the problem dimensions (VGG) are not well suited to such a large PE array, which is why I tried mapping more batches.
Any input is appreciated.
arch_conv.txt parsed-processed-input-large-pe-array-multi-batch.txt
timeloop-mapper.stats.txt timeloop-mapper.map.txt