cucapra / dahlia

Time-sensitive affine types for predictable hardware generation
https://capra.cs.cornell.edu/dahlia
MIT License

Reading list #27

Closed rachitnigam closed 4 years ago

rachitnigam commented 6 years ago

Tracking various relevant papers and articles.

rachitnigam commented 6 years ago

FPGA Programming for the masses:

sampsyo commented 6 years ago

Nice! Here are a couple more overviews in a similar vein:

rachitnigam commented 6 years ago

High level languages for FPGA programming:

rachitnigam commented 6 years ago

Pragmas: Brief descriptions and issues that our type system can solve.

rachitnigam commented 6 years ago

Formal semantics for HDLs:

rachitnigam commented 6 years ago

Types! Capabilities! Greek symbols!

rachitnigam commented 5 years ago

http://halide-lang.org/: Halide is a programming language designed to make it easier to write high-performance image and array processing code on modern machines.

Halide has some interesting abstractions for fast array and image processing. Worth a look to see whether any of them carry over to seashell.

sampsyo commented 5 years ago

On Halide:

sampsyo commented 5 years ago

Bonsai is the closest related work I know of at the intersection of type systems and program synthesis. Its goal is to find bugs in the soundness of type systems by synthesizing programs that are well-typed but fail at run time.

sa2257 commented 5 years ago

Vaguely relevant papers in OOPSLA and MICRO 18?

  1. AnyDSL: A Partial Evaluation Framework for Programming High-Performance Libraries https://2018.splashcon.org/event/splash-2018-oopsla-anydsl-a-partial-evaluation-framework-for-programming-high-performance-libraries https://anydsl.github.io

  2. Rethinking the Memory Hierarchy for Modern Languages https://www.youtube.com/watch?v=XDpttL5_JIQ http://people.csail.mit.edu/sanchez/papers/2018.hotpads.micro.pdf

  3. TAPAS: Generating Parallel Accelerators from Parallel Programs https://www.youtube.com/watch?v=Z6p-hClfg8k

And our own

  1. An Architectural Framework for Accelerating Dynamic Parallel Algorithms on Reconfigurable Hardware https://www.youtube.com/watch?v=SKMGP3VaPWY https://www.csl.cornell.edu/~tchen/files/parallelxl-micro18.pdf

rachitnigam commented 5 years ago

Best Effort FPGA programming

Generating Configurable Hardware from Parallel Patterns

rachitnigam commented 5 years ago

https://hipacc-lang.org/.

Another Halide-like programming language for writing image processing kernels. The Halide-HLS paper points out that HIPAcc generates C, which it feeds to Vivado HLS to generate FPGA designs. Halide-HLS itself seems to use the same workflow.

rachitnigam commented 5 years ago

Lava: Circuit design in Haskell: Seems to provide some interesting FP primitives for building circuits. From a cursory glance, the primitives and composition mechanisms seem more interesting and higher level than VHDL, but not at the algorithmic design level (I might be wrong about that).

sampsyo commented 5 years ago

Just for even more breadth in funky/modern HDLs, PyRTL is another Python-embedded, operator-overloading-heavy example from Santa Barbara. The page also has a nice list of "Related Projects" (including PyMTL) at the bottom.

rachitnigam commented 5 years ago

NESL programming language: http://www.cs.cmu.edu/~scandal/nesl.html

sa2257 commented 5 years ago

Papers Nitish shared from his work with Intel HLS on T2S, a language like Halide for spatial architectures. https://arxiv.org/ftp/arxiv/papers/1711/1711.07606.pdf

rachitnigam commented 5 years ago

From Xilinx manual: "A common issue when designs are first synthesized is report files showing the latency and interval as a question mark “?” rather than as numerical values. If the design has loops with variable loop bounds Vivado HLS cannot determine the latency"

We should be careful when implementing #63.
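
A toy model of why a variable bound leaves the report without a number. This is only a sketch, not how Vivado HLS actually computes latency: `loop_latency` and `render` are hypothetical names, and the model assumes a non-pipelined loop with no entry/exit overhead.

```python
from typing import Optional


def loop_latency(trip_count: Optional[int], body_latency: int) -> Optional[int]:
    """Latency of a non-pipelined loop, or None when the bound is unknown."""
    if trip_count is None:       # variable bound: nothing to multiply by
        return None
    return trip_count * body_latency


def render(latency: Optional[int]) -> str:
    """Mimic a report cell: a number when known, "?" otherwise."""
    return "?" if latency is None else str(latency)


known = render(loop_latency(trip_count=16, body_latency=3))
unknown = render(loop_latency(trip_count=None, body_latency=3))
```

Any latency that depends on the unknown loop is unknown as well, so a single variable bound can blank out the estimate for the whole design.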

rachitnigam commented 5 years ago

Matlab HDL: https://www.mathworks.com/products/hdl-coder/features.html#generating-hdl-code

rachitnigam commented 5 years ago

EASY: Efficient Arbiter SYnthesis from Multi-threaded Code

The authors created a new pass for the LegUp-LLVM tool chain that eliminates unnecessary arbiters from a hardware design. LegUp has a simple, conservative method to figure out how many arbiters a specific design and memory partition scheme needs. The paper uses Microsoft's Boogie verifier to prove unique access to memories by different threads (which seem to be equivalent to unrolled loops). When the tool can prove unique access, it eliminates the corresponding arbiter connection from the design.

The technique is not precise, i.e., it cannot find the optimal arbitration scheme, and it relies on SMT to prove uniqueness, which might take arbitrarily long.

UNVERIFIED: Our type system is probably more precise than what they have since we support a strict subset of iteration patterns.
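
A rough sketch of the "unique access" question, using brute-force enumeration instead of the paper's Boogie-based proof. Everything here is an assumption for illustration: the cyclic partition, the strided access pattern, and the name `banks_needing_arbiters` do not come from the paper.

```python
from collections import defaultdict


def banks_needing_arbiters(num_threads, num_banks, trip_count):
    """Return the set of banks touched by more than one thread.

    Assumes (for illustration only) a cyclic partition, where element `idx`
    lives in bank `idx % num_banks`, and that thread `t` accesses
    elements `i * num_threads + t` for i in range(trip_count).
    """
    users = defaultdict(set)
    for t in range(num_threads):
        for i in range(trip_count):
            idx = i * num_threads + t
            users[idx % num_banks].add(t)
    # A bank with a single user has unique access and needs no arbiter.
    return {bank for bank, threads in users.items() if len(threads) > 1}


no_conflicts = banks_needing_arbiters(num_threads=4, num_banks=4, trip_count=8)
conflicts = banks_needing_arbiters(num_threads=4, num_banks=2, trip_count=8)
```

With four threads and four banks, each thread stays in its own bank, so no arbiters are needed. With only two banks, threads 0 and 2 share bank 0 and threads 1 and 3 share bank 1, so both banks need arbitration.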

rachitnigam commented 5 years ago

SOAP: Structural Optimization of Arithmetic Expressions for High-Level Synthesis: Haven't read yet. They claim:

For the first time, we bring rigorous approaches from software static analysis, specifically formal semantics and abstract interpretation, to bear on source-to-source transformation for high-level synthesis.

rachitnigam commented 5 years ago

HardCaml: Jane Street's library for producing FPGA designs in OCaml.

rachitnigam commented 5 years ago

FPGAs for the Masses. Section 2 has a nice introduction to the state of HLS tools. Note that this is different from a similarly titled paper above.

rachitnigam commented 5 years ago

Silica: Language from Pat Hanrahan's group at Stanford. The basic idea is to take Python coroutines and turn them into FSMs that execute up to each yield in the program's control flow. Essentially, a yield acts like a cycle boundary, and the compiler can schedule all the other coroutines to execute cooperatively every cycle.
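
A minimal sketch of the yield-as-cycle-boundary idea using plain Python generators. This is not Silica's API: `counter`, `toggler`, and `simulate` are hypothetical names made up for illustration, and the simple loop stands in for whatever scheduling the compiler actually does.

```python
def counter(limit):
    """Hypothetical coroutine: counts up modulo `limit`, one value per cycle."""
    value = 0
    while True:
        yield value              # cycle boundary: state is held until the next cycle
        value = (value + 1) % limit


def toggler():
    """Hypothetical coroutine: flips a bit every cycle."""
    bit = False
    while True:
        yield bit
        bit = not bit


def simulate(coroutines, cycles):
    """Advance every coroutine to its next yield once per simulated cycle."""
    trace = []
    for _ in range(cycles):
        trace.append(tuple(next(c) for c in coroutines))
    return trace


trace = simulate([counter(3), toggler()], cycles=4)
```

Each entry of `trace` holds the values every coroutine produced in one cycle. The local variables that survive across a `yield` (`value`, `bit`) play the role of the FSM's registers.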

rachitnigam commented 5 years ago

Reconfigure.io is building a Go-based HLS-as-a-service tool. Their compiler, Rio, uses LLVM to do the usual HLS scheduling stuff. The supported subset of Go focuses on channels and goroutines. The tool seems to be in beta.

rachitnigam commented 5 years ago

Thesis on optimizations for HLS. Papers worth reading to understand the semantics of loops in HLS. Haven't read them myself.

https://github.com/Junyi-Liu/benchmarks-HLS

sa2257 commented 5 years ago

Some papers which try to avoid complexities of HLS

sa2257 commented 5 years ago

Adding in the Bluespec reference manual: http://www.bluespec.com/forum/download.php?id=157

rachitnigam commented 4 years ago

This is now embedded in https://rachitnigam.com/files/pubs/dahlia.pdf