lucidrains / h-transformer-1d

Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
MIT License

Does h-transformer-1d have a subquadratic cost? #7

Closed jinmang2 closed 3 years ago

jinmang2 commented 3 years ago

In the h-transformer-1d paper, the Google researchers say:

1. Introduction
...
In this paper, we draw inspiration from two branches in numerical analysis:
Hierarchical Matrices (H-Matrix) and the Multigrid method.
We propose a hierarchical attention that has linear complexity in run time and memory, ...

In section 6.2, they explain in depth why h-transformer-1d has linear complexity!

My question is: why is it described as a subquadratic cost? Is that different from linear complexity?

In the Linformer paper, they also propose a linear-complexity transformer model. However, in some sources (e.g. https://andre-martins.github.io/docs/dsl2020/attention-mechanisms.pdf, slide 107), attention mechanisms proposed to solve the quadratic bottleneck are often called subquadratic self-attention (e.g. BigBird, Linformer, Linear Transformer, Performer).

What is the difference between subquadratic cost and linear complexity in time and space? Can I understand Luna as also having a subquadratic cost?

lucidrains commented 3 years ago

they are essentially the same thing here! linear complexity refers to O(N), while subquadratic refers to any complexity that is less than O(N^2), which O(N) satisfies (as do the other families of efficient attention)
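
To make the distinction concrete, here is a minimal sketch (not from the thread, just an illustration) that prints how the different complexity classes grow with sequence length N. Both O(N) and O(N log N) grow more slowly than O(N^2), which is why they are all lumped under "subquadratic"; the labels in the comments are illustrative, not exact cost models of any particular architecture.

```python
# Rough comparison of attention cost scaling with sequence length N.
# O(N^2): full softmax attention; O(N log N) and O(N): typical
# "efficient attention" regimes (e.g. hierarchical or low-rank schemes).
import math

for n in (1024, 4096, 16384):
    quadratic  = n ** 2                 # full attention score matrix
    log_linear = n * math.log2(n)       # e.g. some sparse/hierarchical schemes
    linear     = n                      # e.g. linear-complexity attention
    print(f"N = {n:6d} | O(N^2) = {quadratic:12,d} | "
          f"O(N log N) = {log_linear:12,.0f} | O(N) = {linear:6d}")
```

For N = 16384, the quadratic term is already ~268 million entries versus ~229 thousand for N log N and 16 thousand for N, so any scheme in the latter two classes counts as subquadratic, and linear complexity is simply the best case within that family.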