mec-UMN / HISIM

MIT License

Explanation of how the CSV files for AI networks are generated #2

Closed jzhou1318 closed 2 months ago

jzhou1318 commented 3 months ago

From what I understand, the AI network parameters are designed to represent convolution layers best. Other linear layers or attention layers are mapped onto this representation. Do I have the right understanding?

Can the vision transformer representation be explained in more detail? How does this representation capture the parallelism of multi-headed attention?

pragnyan948 commented 3 months ago

Hello Jennifer,

Thank you for your query!

Yes, the AI network parameters are designed around convolution layers, but linear layers and attention layers can also be mapped onto this representation by setting Kx and Ky to 1. The AI network model for ViT, located at https://github.com/pragnyan948/HISIM/blob/main/Module_AI_Map/AI_Networks/Transformer/VIT_base.csv, is explained below.

[image: explanation of the VIT_base.csv network model]
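To make the Kx = Ky = 1 mapping concrete, here is a minimal sketch of how a linear layer could be written as a 1x1 convolution. The `ConvLayer` fields and the `linear_as_conv` helper are hypothetical names chosen for illustration and are not the column layout of VIT_base.csv. The token count (197) and layer widths (768, 3072) are the usual ViT-Base/16 values for a 224x224 input, used here only as example numbers.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical conv-style layer description (not HISIM's CSV schema)."""
    in_x: int    # input feature-map width
    in_y: int    # input feature-map height
    in_ch: int   # input channels
    kx: int      # kernel width
    ky: int      # kernel height
    out_ch: int  # output channels

    def macs(self) -> int:
        # Assumes batch size 1, stride 1, and no padding.
        out_x = self.in_x - self.kx + 1
        out_y = self.in_y - self.ky + 1
        return out_x * out_y * self.kx * self.ky * self.in_ch * self.out_ch


def linear_as_conv(tokens: int, in_features: int, out_features: int) -> ConvLayer:
    """Express a linear layer applied to `tokens` vectors as a 1x1 convolution.

    Each token becomes one spatial position, features become channels.
    """
    return ConvLayer(in_x=tokens, in_y=1, in_ch=in_features,
                     kx=1, ky=1, out_ch=out_features)


# First MLP projection of a ViT-Base block, as an example.
mlp_fc1 = linear_as_conv(tokens=197, in_features=768, out_features=3072)

# The conv-style MAC count matches the matrix-multiply MAC count.
print(mlp_fc1.macs() == 197 * 768 * 3072)
```

With a 1x1 kernel, every output position reads exactly one input position, so the convolution reduces to the same multiply-accumulate work as the original matrix multiply.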

As for the parallelism of multi-head attention, it depends on the NoC and NoP dataflow and is currently not captured. This will be implemented as part of future work.

Please let me know if there are any other queries. Thank you!

Regards, Pragnya

jzhou1318 commented 3 months ago

Can I assume the batch size and stride for convolution representations are 1? What about padding techniques?

And for the transformer specifically, what does the last column represent? It doesn't seem to be used in the code-base. Is it just for book-keeping?

Thanks!

pragnyan948 commented 3 months ago

Yes, please assume a batch size and stride of 1. The stride for all the layers can be altered in functions.py in the Module_Compute folder. The last column is not used in the code base. Please ignore it.

We still need to implement padding techniques in the code. Are there any specific padding techniques you are looking for? We will include these features in the scope of upcoming releases.
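As a rough illustration of what these assumptions imply, the sketch below uses the standard convolution output-size formula. The `conv_output_size` helper is a hypothetical name and is not part of HISIM; the `padding` argument only shows what a padding feature would change, since padding is not implemented in the code yet.

```python
def conv_output_size(in_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis: floor((in + 2*padding - kernel) / stride) + 1."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    padded = in_size + 2 * padding
    if kernel > padded:
        raise ValueError("kernel is larger than the (padded) input")
    return (padded - kernel) // stride + 1


# Defaults match the current assumptions: stride 1, no padding.
print(conv_output_size(224, 3))               # 222
# A 16x16 patch embedding with stride 16, as used by ViT-style models.
print(conv_output_size(224, 16, stride=16))   # 14
# What "same" padding would give for a 3x3 kernel, if it were supported.
print(conv_output_size(224, 3, padding=1))    # 224
```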

jzhou1318 commented 3 months ago

I'm not looking for any specific techniques at this moment; just trying to understand how HiSim runs AI models in general.

Changing the stride in functions.py would cause the Conv representations for the AI models to change as well, correct? Are these representations written manually at the moment, or is there a script that generates them automatically? And is there a way to validate the representations? I've been looking at the ViT model and I notice some discrepancies from layer to layer.

Thanks!

pragnyan948 commented 3 months ago

Yes, please feel free to customize the code by changing the line "self.st = 1" to "self.st = network_params[layer_idx][8]" and using the last column of network.csv to represent the stride. You can then change the stride of particular layers in network.csv accordingly.
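A minimal sketch of that idea, outside of HISIM's own classes: read the rows of a network CSV and take each layer's stride from column index 8. The `load_network_params` and `layer_stride` helpers and the sample rows are hypothetical; the values are placeholders, not taken from any HISIM file, and the fallback to 1 is an assumption for rows whose last column was never meant to hold a stride.

```python
import csv
import io

STRIDE_COLUMN = 8  # last column of network.csv, per the suggestion above


def load_network_params(csv_text: str) -> list[list[int]]:
    """Parse CSV text into one list of integers per layer, skipping blank lines."""
    return [[int(value) for value in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]


def layer_stride(network_params: list[list[int]], layer_idx: int, default: int = 1) -> int:
    """Return the stride for one layer, falling back to `default` if it is missing or invalid."""
    row = network_params[layer_idx]
    if len(row) <= STRIDE_COLUMN or row[STRIDE_COLUMN] < 1:
        return default
    return row[STRIDE_COLUMN]


# Placeholder rows with nine columns each; only the last column matters here.
SAMPLE = """\
224,224,3,16,16,768,0,1,16
197,1,768,1,1,2304,0,1,1
197,1,768,1,1,3072,0,1,0
"""

network_params = load_network_params(SAMPLE)
strides = [layer_stride(network_params, i) for i in range(len(network_params))]
print(strides)  # [16, 1, 1]
```

The third row carries a 0 in the last column, so the helper falls back to stride 1 rather than passing an invalid stride downstream.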

These representations are written manually. We will provide a script in the future. Please do let us know if there are any discrepancies in the network.csv files. Thank you!