dmsgnn / master-thesis

MLIR-based FPGA toolchain for Graph Neural Network acceleration using High-Level Synthesis. Developed for the Master of Science research thesis.

Thesis structure #5

Closed: dmsgnn closed this issue 10 months ago

dmsgnn commented 1 year ago

description

The aim of this issue is to list all the ideas about the structure of the thesis and its content.

how to compile

The thesis is written in LaTeX. It can be compiled using the following command:

pdflatex -output-directory out Thesis.tex && bibtex out/Thesis.aux && pdflatex -output-directory out Thesis.tex && pdflatex -output-directory out Thesis.tex 

The compiled output, including the complete PDF version of the thesis, is available in the out folder.

dmsgnn commented 1 year ago

contents

The structure I have in mind for the thesis consists of the following main chapters:

  1. introduction -> introduction to graph neural networks (mostly on the fact that they are nowadays widely used), to what will be presented in the thesis (a brief explanation of its purpose) and why.
  2. background -> chapter about graph neural networks, with the main focus on graph convolutional networks and the graph isomorphism network, and a shorter discussion of other models (+ virtual node, GAN, ...)
  3. state of the art -> chapter where I am going to discuss state-of-the-art GNN accelerators, based on the literature review I am doing, explaining some of their limitations
  4. experimental results -> the main chapter of the thesis, where I am going to explain everything that has been done: models used, practical implementation, frameworks and technologies used, ...
  5. conclusions -> final chapter where I will explain what the final result has been, summarize the limitations encountered, and outline the future work that could be done based on the conducted research

The length of the chapters is estimated as follows:

  1. introduction -> 10 pages
  2. background -> 15 pages
  3. state of the art -> 20 pages
  4. experimental results -> 25 pages
  5. conclusion -> 10 pages

For a total estimated thesis length of 80 pages of pure content (excluding abstract, bibliography, lists, ...).

dmsgnn commented 1 year ago

updated table of contents

2/ Background (potential things to add)
    - Graph representation (adjacency, CSR and COO; a toy example is sketched after this list) [done]
    - A bit of history about Graph Neural Networks [maybe]
    - MLIR and Torch-MLIR [done]
    - Sparse Tensors in MLIR [maybe]
    - High-Level Synthesis [done]
    - LLVM [maybe]
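
A toy example might make the graph representation subsection concrete. The following sketch (illustrative only, not thesis code) stores the same small directed graph as a dense adjacency matrix, in COO format and in CSR format:

import numpy as np

# Edges of a small directed graph: 0->1, 0->2, 1->2, 3->1 (sorted by source node)
edges = [(0, 1), (0, 2), (1, 2), (3, 1)]
num_nodes = 4

# Dense adjacency matrix: A[i, j] = 1 if there is an edge i -> j
A = np.zeros((num_nodes, num_nodes), dtype=np.int8)
for src, dst in edges:
    A[src, dst] = 1

# COO: two parallel arrays with the row and column index of each non-zero
coo_row = np.array([src for src, _ in edges])
coo_col = np.array([dst for _, dst in edges])

# CSR: column indices plus a row-pointer array; row i owns the slice
# col_idx[row_ptr[i]:row_ptr[i+1]]
row_ptr = np.zeros(num_nodes + 1, dtype=np.int64)
for src, _ in edges:
    row_ptr[src + 1] += 1
row_ptr = np.cumsum(row_ptr)
col_idx = coo_col  # edges above are already grouped by source node

print(A)
print("COO:", coo_row, coo_col)
print("CSR:", row_ptr, col_idx)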

3/ Related Work (16 pages)
    - Chapter Structure (1 page)
    - Software frameworks and accelerators (2 pages)
    - Hardware accelerators (8 pages)
        * Unified architecture accelerators (3 pages)
            (Talk about EnGN and AWB-GCN)
        * GNN acceleration using Tiled architecture (1 page)
            (Talk about Auten et al.)
        * Hybrid architectures for GNN acceleration (3 pages)
            (Talk about HyGCN and GRIP)
    - High-Level Synthesis based accelerators (2 pages)
        (Talk about GenGNN - mentioning BoostGCN - and DGNN-Booster)
    - Software-hardware co-design accelerators (1 page)
        (Talk about the design from Zhang et al. + GCoD)
    - Graph processing acceleration using HBM-equipped FPGAs (1 page)
        (Talk about GraphLily)
    - Matrix-matrix multiplication optimization (1 page)
        (Talk about state-of-the-art libraries for CPU/GPU such as cuBLAS and how they work, then
         about the matmul acceleration in MLIR paper; a tiling sketch follows this chapter outline)
        * Matmul optimization in MLIR
    - Conclusion (1 page)
        (Talk about the difficulty of comparing accelerators. Talk about OGB.
         Talk about the fact that one-size-fits-all is difficult in GNNs. Recall this section in
         the chapter where I discuss my motivation and the aim to bring a design flow to
         build GNN accelerators for each architecture)
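
To make the matmul optimization discussion concrete, a small illustrative sketch of loop tiling could accompany this part; tiling is the core transformation that both CPU/GPU BLAS libraries and the MLIR matmul work rely on. The tile size and matrix shapes below are arbitrary placeholders, not values used in the thesis:

import numpy as np

def matmul_blocked(A, B, tile=64):
    # C = A @ B computed one (tile x tile) block at a time, so each block of
    # A, B and C stays resident in fast memory while it is reused.
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                C[i0:i0 + tile, j0:j0 + tile] += (
                    A[i0:i0 + tile, k0:k0 + tile] @ B[k0:k0 + tile, j0:j0 + tile]
                )
    return C

A = np.random.rand(256, 256).astype(np.float32)
B = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(matmul_blocked(A, B), A @ B, atol=1e-3)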

4/ Problem Formulation
    - GNN acceleration
    - State of the art limitations
    - Motivation

5/ FPGA Design Flow for GNN Acceleration
    - Design Flow
        * PyTorch
        * torch-mlir (see the sketch after this list)
        * soda 
        * PandA-Bambu
    - Limitations
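
A minimal sketch of the first step of this flow (PyTorch to MLIR through torch-mlir) could help the reader. It assumes the torch_mlir.compile entry point of the torch-mlir Python package, whose exact signature varies across versions, and uses a stand-in single-layer model rather than the thesis one:

import torch
import torch_mlir

class TinyModel(torch.nn.Module):
    # Stand-in model: a single linear layer, i.e. one matmul plus bias.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 8)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
example_input = torch.randn(4, 16)

# Lower the model to the linalg-on-tensors level, the abstraction that the
# SODA and Bambu steps of the flow can then consume.
module = torch_mlir.compile(model, example_input, output_type="linalg-on-tensors")
print(module)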

6/ Experimental procedure and results
    - GNN models
        * Open Graph Benchmark models
        * PyTorch Graph Convolutional Network (a toy layer is sketched after this outline)
    - From PyTorch to JIT
        * one section for each macro change
    - Lowering to MLIR
    - SODA optimization
        * Matmul optimization
    - Synthesis
    - Numerical results
       (Talk about the time of matrix multiplication in PyTorch - with increasing size: 15, 250, 1000, 2708 - versus the synthesized
        version. Then, to explain how this optimization can accelerate GCN inference, show the time difference between PyTorch
        and the Bambu version with increasing size. Another histogram will be about the performance difference between the
        2-channel version and the 32-channel version with external memory)
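
A toy GCN layer could be sketched here to show exactly where the accelerated matrix multiplications sit; the sketch below is illustrative (Kipf and Welling style, dense adjacency) and is not the model evaluated in the thesis:

import torch

class GCNLayer(torch.nn.Module):
    # One graph convolution: H' = ReLU(A_hat @ H @ W), with A_hat the
    # symmetrically normalized adjacency matrix with self-loops.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = torch.nn.Linear(in_features, out_features, bias=False)

    def forward(self, adj, h):
        a_hat = adj + torch.eye(adj.shape[0])
        deg_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_hat = deg_inv_sqrt.unsqueeze(1) * a_hat * deg_inv_sqrt.unsqueeze(0)
        # The two matmuls below are the operations targeted by the
        # optimization and synthesis steps discussed in this chapter.
        return torch.relu(a_hat @ self.weight(h))

layer = GCNLayer(in_features=8, out_features=4)
adj = torch.randint(0, 2, (5, 5)).float()
feats = torch.randn(5, 8)
print(layer(adj, feats).shape)  # torch.Size([5, 4])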

7/ Conclusion and future developments
    - Evaluation of the design flow
    - Efficiency of the generated designs
    - Future developments

dmsgnn commented 1 year ago

deadlines

These are the estimated deadlines for each chapter. The thesis is expected to be finished (small changes excluded) by the 20th of August (preferably) or the 3rd of September.

The week from 21-08 to 27-08 is not available due to holiday.