Open editorialbot opened 7 months ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
```
github.com/AlDanial/cloc v 1.88  T=0.25 s (475.2 files/s, 457039.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Rust                            97           9275           5579          99433
TOML                            12            100              1            572
Markdown                         5            118              0            506
C++                              1             34              8            388
YAML                             3             21              1            115
TeX                              1              0              0            114
Python                           1             21              0             75
HTML                             1              0              0             15
-------------------------------------------------------------------------------
SUM:                           121           9569           5589         101218
-------------------------------------------------------------------------------
```
gitinspector failed to run statistical information for the repository
Wordcount for paper.md is 746
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1145/2503210.2503219 is OK
- 10.1145/2382585.2382587 is OK
- 10.1145/1391989.1391995 is OK
- 10.1016/b978-0-12-803761-4.00011-3 is OK
MISSING DOIs
- 10.31003/uspnf_r2491_01_01 may be a valid DOI for title: Rayon
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Hi, @FCLC @mhoemmen 👋 Welcome to JOSS and thanks for agreeing to review! The comments from @editorialbot above outline the review process, which takes place in this thread (possibly with issues filed in the faer repository). I'll be watching this thread if you have any questions.
The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention this issue so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.
We aim for reviews to be completed within a month or two. Please let me know if you require some more time. We can also use editorialbot to set automatic reminders if you know you'll be away for a known period of time.
Please feel free to ping me (@jedbrown) if you have any questions/concerns.
The summary is concise and explains the library well to me. For example, "dense linear algebra library" is explained as "matrix decompositions and solving linear systems." ("Linear algebra library" means different things to different people, so having the explanation is helpful.)
It would be interesting to know what inspired the name "faer."
The statement of need is compelling. The following quote conveys the motivation well.
Aside from faer, the Rust ecosystem lacks high performance matrix factorization libraries that aren’t C library wrappers, which presents a distribution challenge and can impede generic programming.
The "impede generic programming" phrase is enough justification for me. I've encountered C++ libraries that exist only for this reason, to make it possible to factor sparse matrices that aren't supported by popular C libraries.
Do the authors describe how this software compares to other commonly-used packages?
The author briefly mentions and cites popular C, Fortran, and C++ (Eigen) software packages. However, it could be helpful to learn more about the library's interface design and how that compares to other packages. This is another way in which "linear algebra library" can mean different things to different people. For example, is it a "Matlab in Rust" (like the "Matlab in C++" packages Eigen and MatX), with overloaded operators and short functions, that doesn't worry so much about the cost of allocating temporary matrices or vectors? Is it a collection of stateless algorithms, like P1673 a.k.a. std::linalg (voted into the current C++26 draft)?
Apologies for the delay in reviewing; I only received permission from my employer to review as of this morning! I'll be looking to provide initial comments in the coming days and weeks.
@jedbrown did you have a publication date target?
Thanks @FCLC. We try to complete reviews within two months if no major changes are needed by the authors. It sounds like that'll be workable for you?
@mhoemmen :wave: How is your review going? I see you had a question about allocation tucked in at the end of your comment. Faer supports `no_std`, which is an environment without allocation or a standard library. In that sense, it is low-level. I think you'll find the ergonomics of the high-level interfaces are intermediate, albeit safe.
@jedbrown Work has been a lot lately, so I've been running late on this one. Thanks for the clarifying comment! I'll finish this review as soon as I can.
@FCLC @mhoemmen :wave: Hi, could you let us know your timeline for finishing your review? Are there any hold-ups that Sarah or I could help with?
@jedbrown Just purely what I've discussed offline... I will get to this as soon as I can.
Reviewing the paper itself, I have a few comments, organized by line number (L{number})
L11: Worth clarifying that faer can operate on non-symmetric matrices, as well as the other categories it can and cannot natively handle? (Triangular? Hermitian?)
L14: Which standard do these floats adhere to? (Presumably that's IEEE 754, in either the 1985 or the 2008 flavour?) Complex types are mentioned; are they equivalent to the complex types defined in the reference BLAS? (They are discussed later, at L28-31.)
L15: The idea of universal type support is interesting, what can this be extended to? Integer maths? Fixed point? Posits? Fundamentally, could a user map any type that Rust/LLVM can use and have it be decently performant? Why or why not?
L18: SIMD is decently well documented online, but it is not very well known outside of low-level optimization circles; for higher-level devs it may be helpful to spell out the acronym, if not provide a short description.
L22-26: I'd agree that a Rust-native matrix library is useful, but can this Rust library be built and used from C-native projects? If so, does it share a common API/ABI (LAPACK, BLAS, etc.) so that it is usable from those other languages?
L24-26: Does the lack of a stable Rust ABI present portability challenges for faer? Is there an expectation that users of faer will always have to rebuild it from source between versions? If so, given that linear algebra libraries tend to be heavily leveraged in HPC contexts/environments, does the use of Rust mandate a change in approach to (sub)versioning?
L29: To help those not intimately familiar with performance engineering, is it worth adding a brief sentence explaining why contiguous memory is valuable? Even something as simple as a mention of strided/aligned memory accesses?
L38: How is CPU feature detection handled? AOT-compiled multi-target dispatch for a list of targets, compilation for the local target only, etc.? Does this detection rely on cargo, or does it use other/homegrown methods?
L41-45: it might be worth having a glossary or spelling out what these terms mean?
L47: Brief explanation of what is meant by "optimized fused kernels"?
L57-61: Since most performance metrics for BLAS and BLAS-adjacent libraries tend to be expressed in floating-point operations per second, is it worth providing a table comparing faer against various other implementations in terms of (giga) floating-point operations per second?
L61: {Graphs} F128 is mentioned, but not which f128 this is. Is it IEEE FP128? IBM/PPC FP128? 2xFP64 packed FP128 (where the leading FP64 is IEEE 754 binary64, with an extended 64 bits treated as a uint softfloat)?
Beyond that: is it worth having a statement/line indicating which platforms are supported? For example, is faer a Linux x86-64-only library? What about Arm64? macOS? Windows? FreeBSD? Does Rust/cargo being limited in HPC environments cause concerns for long-term support of versions, or are users expected to use cargo/Spack/EasyBuild to stay on top of releases?
I'll reply to the comments here for now, but let me know if I should also add these clarifications to the paper itself and rebuild.
L11: Could you explain what you mean by natively handle? There are algorithms for general, triangular, symmetric, and Hermitian matrices. (In the past year, sparse matrix algorithms have also advanced quite a bit.) The main ones missing, AFAIK, are banded matrices and packed triangular/symmetric/Hermitian storage, but that's not an inherent limitation of faer.
L14: Rust uses IEEE 754-2008 floating-point types for f32 and f64. The complex types c32 and c64 are the same as `float _Complex` and `double _Complex` in C, but there's also a generic implementation, `Complex<T>`, that can be used with different float types.
L15: Integer maths is unlikely, but the most interesting scenarios I can think of would be things like extended/multiprecision floating-point types, similar to what MPFR implements, or things like interval arithmetic, forward autodiff types (dual/hyperdual numbers), and any possible combinations of such types (`Complex<Dual<BigFloat>>` should also work in most situations).
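To illustrate the kind of generic composition being described (toy types for illustration, not faer's generic-entity machinery), here is a minimal forward-mode dual number that could in principle serve as a scalar type:

```rust
use std::ops::{Add, Mul};

// Toy forward-mode dual number: a value and its derivative.
// Illustrative only; real autodiff scalar types are more involved.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual {
    val: f64,
    der: f64,
}

impl Add for Dual {
    type Output = Dual;
    fn add(self, o: Dual) -> Dual {
        Dual { val: self.val + o.val, der: self.der + o.der }
    }
}

impl Mul for Dual {
    type Output = Dual;
    fn mul(self, o: Dual) -> Dual {
        // Product rule for the derivative part.
        Dual { val: self.val * o.val, der: self.der * o.val + self.val * o.der }
    }
}

fn main() {
    // f(x) = x * x + x, evaluated at x = 3 with dx = 1.
    let x = Dual { val: 3.0, der: 1.0 };
    let f = x * x + x;
    println!("f = {}, f' = {}", f.val, f.der); // f = 12, f' = 7
}
```

Any algorithm written generically over `Add`/`Mul`-like bounds can then run over such a type unchanged, which is the appeal of avoiding a fixed scalar set.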
L18: I can add in a short description.
L22-26: Rust allows interoperability with C by using the C ABI on functions marked `extern "C"` (commonly also combined with `#[no_mangle]` so the symbol can be found by the linker as you'd expect). I've used faer from C and C++ in the past and it works quite well.
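For readers unfamiliar with Rust's C interop, a minimal sketch of that pattern (a hypothetical `dot` routine, not part of faer's API) looks like this:

```rust
// Sketch of exposing a Rust routine over the C ABI. A C caller would declare
// `double dot(const double *a, const double *b, size_t n);` and link against
// the compiled library. `dot` is an illustrative function, not faer's API.
#[no_mangle]
pub extern "C" fn dot(a: *const f64, b: *const f64, n: usize) -> f64 {
    // Reconstruct slices from the raw pointers the C caller passed.
    let (a, b) = unsafe {
        (std::slice::from_raw_parts(a, n), std::slice::from_raw_parts(b, n))
    };
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    // Calling through the same symbol a C caller would use.
    let r = dot(a.as_ptr(), b.as_ptr(), 3);
    println!("{r}"); // 32
}
```

Building with `crate-type = ["cdylib"]` (or `staticlib`) in Cargo.toml then produces a library a C or Fortran project can link normally.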
L24-26: Users are expected to rebuild when upgrading versions, yes. But there's nothing preventing us or others from distributing precompiled static/dynamic library files exposing a C API.
L29: I can add in an explanation.
L38: CPU detection is done for predetermined targets, but each type can specify which targets it cares about. For example, if a user wants to work with f16 (half) matrices, they can specify that they only want to try to detect something like AVX512-FP16 on x86.
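As a sketch of what such runtime dispatch can look like in plain Rust (illustrative only; faer's actual detection machinery differs), using the standard `is_x86_feature_detected!` macro on x86_64 with a scalar fallback elsewhere:

```rust
// Toy runtime CPU-feature dispatch: pick a vector width based on what the
// running CPU supports. Not faer's internals, just the general pattern.
fn simd_width_f64() -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        // std's runtime detection macro; checks CPUID once and caches.
        if is_x86_feature_detected!("avx512f") {
            return 8; // 512-bit registers hold 8 f64 lanes
        }
        if is_x86_feature_detected!("avx") {
            return 4; // 256-bit registers hold 4 f64 lanes
        }
    }
    // Scalar fallback for other targets or older CPUs.
    1
}

fn main() {
    println!("processing {} f64 lane(s) at a time", simd_width_f64());
}
```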
L41-L45: Sure thing, I'll add that to the paper.
L47: In this context it refers to computing things like `A x` and `A.T y` simultaneously, traversing `A` only once, which can significantly speed up some memory-bound algorithms.
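To make the fusion concrete, here is a toy sketch (illustrative only, not faer's kernel) that computes `y1 = A x` and `y2 = Aᵀ z` in a single traversal of `A`:

```rust
// Fused matrix-vector products: each element of `a` is loaded once and
// contributes to both y1 = A * x and y2 = Aᵀ * z. A real kernel would use
// blocking and SIMD; this only shows the single-pass idea.
fn fused_matvec(a: &[Vec<f64>], x: &[f64], z: &[f64]) -> (Vec<f64>, Vec<f64>) {
    let (m, n) = (a.len(), x.len());
    let mut y1 = vec![0.0; m];
    let mut y2 = vec![0.0; n];
    for i in 0..m {
        for j in 0..n {
            let aij = a[i][j]; // single load serves both products
            y1[i] += aij * x[j]; // row i of A times x
            y2[j] += aij * z[i]; // column j of Aᵀ times z
        }
    }
    (y1, y2)
}

fn main() {
    let a = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let (y1, y2) = fused_matvec(&a, &[1.0, 1.0], &[1.0, 1.0]);
    println!("{y1:?} {y2:?}"); // [3.0, 7.0] [4.0, 6.0]
}
```

Since `A` is read from memory only once instead of twice, a memory-bound computation can approach a 2x speedup from this kind of fusion.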
L57-61: GFLOP/s works for matrix multiplication, but for more complex iterative algorithms like the eigenvalue decomposition, the flop count can depend on several factors and there's no closed-form formula for it, so I'm not sure how to communicate that.
L61: That's my bad, I should have clarified that it's f64x2 (commonly referred to as double-double).
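For the curious, the double-double idea represents a value as an unevaluated sum of two `f64`s (a high and a low word); the basic building block is an error-free transformation such as Knuth's two-sum. A small illustrative sketch (not faer's implementation):

```rust
// Knuth's two-sum: returns (s, err) such that a + b == s + err exactly,
// where s is the rounded f64 sum and err is the rounding error.
// This is the primitive that double-double ("f64x2") arithmetic builds on.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let s = a + b;
    let bb = s - a;
    let err = (a - (s - bb)) + (b - bb);
    (s, err)
}

fn main() {
    // 1.0 + 1e-20 is not representable in a single f64: the small term would
    // be lost entirely. The low word recovers it exactly.
    let (hi, lo) = two_sum(1.0, 1e-20);
    println!("hi = {hi}, lo = {lo}"); // hi = 1, lo = 0.00000000000000000001
}
```

Chaining such transformations yields roughly 106 bits of significand from pairs of hardware doubles, which is the sense in which "f128" was meant above.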
faer supports any platform that Rust itself supports, so it can run on the common operating systems and the target architectures Rust supports: x86, x86_64, mips, powerpc, powerpc64, arm, and aarch64. Vectorization, however, is currently only supported on x86/x86_64 and aarch64 (no SVE support at the moment), because other platforms' intrinsics are currently unstable in Rust.
Other than that, users are expected to use cargo to stay up to date, but they can build and distribute faer as a C API (still not currently exposed by faer itself, though users can easily expose the parts they care about) plus a static/dynamic library if the environment doesn't support building Rust projects from source.
L11: If it can handle them all, it might be worth a quick mention in the paper, as a means of differentiating from the many toy implementations that don't bother implementing anything beyond GEMM, GEMV, DOT, and AXPY. faer is very rich in its feature set, but I don't know that it comes across; it might just be my reading.
L14: It might be worth mentioning that the library/Rust uses 754-2008 in that case, especially for users who might want to use Rust as their math library from another language. The 2008 revision in particular explicitly allows floating-point contraction, which some languages may not expect but which can affect things like bit patterns. (I'll leave the for-better-or-worse discussion of FP contraction aside!)
L15: Interesting!
L18: sounds good!
L22-26 && L24-26: That's great news! Perhaps a line to that effect would be useful? It might be out of scope for this paper, but does using the `extern "C"` interface mean a non-Rust expert could extern-"C" their way all the way to a Fortran wrapper?
L29: Ack
L38: FP16 is interesting in how little hardware support it has, limited to only a very small subset of x86 targets (ignoring the F16C ISA extension). When something isn't supported in hardware, does it fall back to an LLVM softfloat? (Sort of related to L15.)
L41-45: ack
L47: ack, maybe worth a quick line to that effect?
L57-61: Fair enough!
L61: Ack
L{end}: Does the lack of intrinsics support mean that faer/Rust will not be competitive on those platforms in the short/medium term?
Jed met with me today to go through the library and help me get started downloading it. I've had some "editor's block" about starting the review. My apologies to the author, who has waited far too long on it. I didn't feel ready because I didn't understand Rust well enough, but Jed helped me understand a tiny bit so that I wouldn't be completely lost. He also emphasized that I should focus on the library's functionality independent of whether the library is idiomatic Rust code.
I'll add some remarks here from my conversation with Jed today.
Jed explained that Sarah wrote a clean-room reimplementation from the algorithms, rather than porting existing code to Rust or wrapping existing non-Rust libraries. This is an extraordinary effort. From an algorithm development perspective, it was comparable to that of developing EISPACK and LINPACK from the Handbook. From a software library development perspective, it was actually more effort, because the idioms for managing views of subsets of multidimensional arrays in languages like Fortran or C++ do not carry over naturally to Rust. This means that Sarah couldn't just port (say) LAPACK, because Rust doesn't come with a way to say things like "create a view of rows 10-15 and columns 20-29 of this matrix." As a result, Sarah had to develop her own data structures, such as a matrix container (a C++ word meaning something like "data structure owning an allocation") and mutable and immutable "reference" (C++ might say "subview") types. Along with this, Sarah had to come up with new idioms, like what an "in-place factorization" means. (Instead of "operate in place on a view of the matrix" as in LAPACK, it means "consume this matrix and produce a new matrix that possibly reuses the storage of the consumed matrix.")
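The consuming idiom described above can be sketched in a few lines of plain Rust (hypothetical `Matrix` and `Cholesky` types for illustration; this is not faer's actual API):

```rust
// Sketch of the "consuming" in-place factorization idiom: the factorization
// takes ownership of the matrix and returns a result that reuses its storage.
struct Matrix {
    data: Vec<f64>,
    n: usize,
}

struct Cholesky {
    factor: Matrix,
}

// Hypothetical in-place Cholesky for a 1x1 SPD "matrix", just to show the
// ownership pattern: `m` is moved in, its buffer mutated, then moved out.
fn cholesky_in_place(mut m: Matrix) -> Cholesky {
    assert_eq!(m.n, 1);
    m.data[0] = m.data[0].sqrt();
    Cholesky { factor: m } // no new allocation; same Vec, new owner
}

fn main() {
    let m = Matrix { data: vec![9.0], n: 1 };
    let chol = cholesky_in_place(m);
    // `m` can no longer be used here; the compiler enforces the move.
    println!("{}", chol.factor.data[0]); // 3
}
```

The ownership transfer gives the same zero-copy behavior as LAPACK's overwrite-the-input convention, while making it a compile error to accidentally use the overwritten matrix.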
Thanks to Jed's help, I was able to build (is that even the right word?) and run the library's tests. The experience was completely without hassle. The hardest part by far was convincing my Linux system to install Rust, and that wasn't even so hard. The test output was extensive and gave a tour of the library's features. It also showed that one test (the matrix multiply test) didn't run, because it would have taken too long on my system. Presumably it's possible to enable that test by hand.
Regarding
Functionality: Have the functional claims of the software been confirmed?
I have to learn how to navigate the code repository in order to do this, because I don't know a priori where the tests live. There's no "test" directory, for example. `git grep test` has led me to src/lib.rs, which appears to be the main entry point to the library. I see extensive human-readable documentation comments and some code constructs like `#[cfg(test)]` and `#[test]`. I'm guessing that the latter are embedded tests.
Yeah, this is the pattern used for unit tests. It causes this code to be compiled out for regular use (in which case `cfg(test)` is false) but automatically puts it in the test harness when running tests.
```rust
#[cfg(test)]
mod tests {
    #[test]
    fn test_name() {
        // ...
    }
}
```
You can also see how things are tested in CI (which includes running the test suite with code coverage) in .github/workflows/run-tests.yml. Note that there is also some static analysis/linting in CI.
src/lib.rs documentation begins with matrix creation, slicing, and overloaded operators. The functionality looks familiar both to Matlab users and to users of comparable C++ libraries like Eigen.
I do wish the documentation had explained what this `mat!` thing is, though perhaps "`foo!` creates a `Foo`" is a Rust idiom with which I am unfamiliar. Jed went into some detail about how this is a macro that avoids creating a temporary to hold the literal input. I've written something just a bit comparable in C++ for the `mdarray` proposal (P1684), namely a nested `initializer_list<initializer_list<value_type>>` constructor.
Expressions like the following in the documentation appear to act like C++ expression templates, in that they are nonowning views of existing matrices that represent matrix arithmetic expressions. Rust doesn't have the dangling-references problem that classical C++ expression templates have. Rust's syntax also makes it clearer (contrasted with the "`auto` return type problem" that, e.g., Eigen has) that the type of something like `add` below is a reference, not a new allocation.
```rust
let add = &a + &b;
let sub = &a - &b;
let scale = scale(3.0) * &a;
let mul = &a * b.transpose();
```
I'm curious about the evaluation of non-elementwise expressions like `&a * b.transpose()`. Is that an elementwise multiplication or the usual matrix multiply? The documentation says "`*` for either scalar or matrix multiplication depending on the types of the operands." If `*` can represent matrix-matrix multiplication, then what happens with expressions like `&a * &b * &c`? Does the implementation attempt to solve the matrix chain multiplication problem? Are these expressions always lazily evaluated? Do they need to allocate intermediate temporary storage, and if so, do they attempt to reuse such storage for the various products in the chain?
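As a rough illustration of why such borrowed expression objects are safe in Rust (a toy sketch over vectors, not faer's types), an expression can hold references to its operands and only allocate on explicit evaluation; the borrow checker then rejects any use of the expression after its operands are dropped:

```rust
// Minimal lazy "expression" type over borrowed data. The lifetime 'a ties
// the expression to its operands, so it cannot outlive them.
struct AddExpr<'a> {
    a: &'a [f64],
    b: &'a [f64],
}

impl<'a> AddExpr<'a> {
    // Evaluation allocates only when explicitly requested.
    fn eval(&self) -> Vec<f64> {
        self.a.iter().zip(self.b).map(|(x, y)| x + y).collect()
    }
}

fn main() {
    let a = vec![1.0, 2.0];
    let b = vec![3.0, 4.0];
    let add = AddExpr { a: &a, b: &b }; // nonowning, like `&a + &b` above
    println!("{:?}", add.eval()); // [4.0, 6.0]
    // If `a` or `b` were dropped before `add.eval()`, compilation would fail,
    // which is exactly the dangling-view bug C++ expression templates permit.
}
```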
Regarding the indexing expression `let a00 = a[(0, 0)];` in the documentation, Jed explained that Rust does not permit array indexing expressions to have more than one argument. I was happy to be involved in C++23's removal of that restriction, so that `mdspan` and `mdarray` can say, e.g., `a[0, 0]`. This is a language restriction, though, not a library restriction.
For the factorizations listed in src/lib.rs, I'll separate my discussion into "SVD or eigendecomposition" and "everything else" (the "one-sided" factorizations).
Regarding "everything else," I'm pleased to see some harder-to-implement but more interesting factorizations, like Bunch-Kaufman decomposition, QR with column pivoting (useful for applications like hierarchical matrices), and LU with full pivoting.
Regarding "SVD or eigendecomposition," it's a bold choice for a library to include those. These are much trickier to implement and test. Even LAPACK's eigendecompositions have evolved quite a bit in the last 15-20 years, including entirely new algorithms (e.g., MRRR) in the symmetric/Hermitian case. It's possible that LAPACK is the only source for well-maintained, well-tested implementations of the newer algorithms (though I haven't checked Eigen). It looks like faer has functionality comparable to LAPACK's "standard" drivers, `_geev`, `_syev`, or `_heev` (all of which do QR iteration). Considering how difficult these algorithms are to implement and test, it's reasonable for a single-developer effort not to cover the full space of problems (e.g., the generalized eigenproblem) and algorithms.
I've been going through the tests. It's great to see both the usual random matrix tests, and special cases, including matrices of all NaNs.
Something that I miss, particularly for the SVD and eigendecompositions, is a discussion of test provenance. The experience of the QR algorithm is "we had to do these funny things to avoid known bad cases," so specific test cases matter a lot. For increasing test coverage, it may pay to go through the LAPACK test suite and the LAPACK Working Notes (LAWNs).
Submitting author: @sarah-ek (Sarah El Kazdadi) Repository: https://github.com/sarah-ek/faer-rs Branch with paper.md (empty if default branch): Version: v0.15.0 Editor: @jedbrown Reviewers: @fclc, @mhoemmen Archive: Pending
Status
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@fclc & @mhoemmen, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jedbrown know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @mhoemmen
📝 Checklist for @FCLC