Closed mileslucas closed 2 years ago
Awesome! Thanks for reviewing the paper/package. I am really happy to get some more eyes on the package. I'll start addressing these comments as they come in.
Modeling with non-analytic Fourier transforms
I got through this tutorial just fine, this time. I really like your approach for the composite models and the hierarchy of primitive types.
I'm curious here why not use a product of 2, 3, 5, or 7 (e.g. nextprod((2,3,5,7), padfac * nx)
). It's my understanding FFTW can handle these sizes efficiently, more than just the powers of two.
Just wanted to point out if you try and copy-paste these tutorials, they will fail due to the name clash between Comrade.Gaussian
and the Gaussian
struct created. May I suggest using MyGaussian
for the tutorial, or something like that, to keep the scope clean.
There's a lot of docstrings that are slightly mangled, e.g. 1, 2, 3, 4, etc. I think a quick pass through all the docstrings and tidying them up so there's a clear function signature formatted appropriately. You might look for referenced functions, too, for example here should reference renormed
but it's spelled renomed
. If it's [](@ref)
you'll at least see the broken link in the documenter build logs.
The repository/docs are missing community guidelines
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support?
Adding something to the README and the docs homepage about where to file issues, any standards (e.g. code style, ColPrac), etc is all you need.
L22: Given the diverse nature of AGN and black holes
AGN acronym not introduced (could be introduced in L7)
L23: Gaussians, disks, rings, crescents to extract relevant features
"disks, rings, and crescents"
L25: flexible model interface, enables direct
"flexible model interface , enables direct"
L34: just-ahead-of-time compiler, and code specialization
"just-ahead-of-time compiler , and code specialization"
Overall great writing- you clearly listed the motivation for Comrade.jl and discussed all the challenges with VLBI imaging. Perhaps some more discussion about the state of the field could be warranted, either in the paper or in the docs. For example, what is the maturity of modeling with Comrade.jl vs eht-imaging?
Are there any benchmarks? You mention in the README the want for speed and proper SIMD, GPU, and distributed support. What is the current status of those goals?
Thank you for all your comments! Time for me to get to work.
Comrade
attempts this imaging uncertainty this sentence is missing a verb, I think?
I edited the first part of the introduction to make it a little clearer.
I have fixed the broken link.
I got some warnings
This should be fixed now on master. Thanks for catching that.
I think it would be clearer if you put the package imports where they are used- for one, if the user isn't interested in, say, the AHMC part of the tutorial, then they won't need to install/load the package...
I like this a lot! I am used to putting my using
or import
statements at the top of my files, but for tutorials this is a good strategy.
Tangential note: There's something flagrantly un-Julian about having to make 5 unique packages for 5 unique optimization algorithms when the language is build on composability...
I completely agree and I have struggled with this myself. Originally I hadn't interfaced with any sampling packages and just made some primitives that would make it easier to interface with samplers, i.e. auto parameter transforms to R^n or the unit hypercube. However, I found that this was asking a lot for my expected user base where a lot of them are coming from python so having something more integrated was useful. I had originally used Requires.jl but that isn't great since I couldn't put version bounds in, so then I settled on my current strategy. In the future, I may move back to my original plan.
not clear what "# from EHTC VI 2019" is supposed to mean, and extraneous period.
Fixed!
This all came from an old version of ExtendedRing
that included the radius as one of its parameters. I had changed this to just be
m = ExtendedRing(8.0)
# change diameter
stretched(m, 10.0, 10.0)
to fit more in line with the general philosophy of the package. With this change, the plot
error goes away.
I also fixed the modelimage part of the tutorial. This part of Comrade has changed quite a bit recently.
I'm curious here why not use a product of 2, 3, 5, or 7 (e.g. nextprod((2,3,5,7), padfac * nx)). It's my understanding FFTW can handle these sizes efficiently, more than just the powers of two.
This is a good idea! Fixed in #147
I renamed all the structs to MyGaussian
Yes. I was pretty sloppy here. I'll go through all of them and start fixing them in #148
I've added this to the README and a brief statements in the docs.
I'll update the typos ASAP.
A comparison to eht-imaging is interesting since it partially uses pretty different methods. eht-imaging uses regularized maximum likelihood (RML) imaging methods to produce images with an objective function
$$ J(I, \alpha, \beta | D) = \sum_k \alpha_k \chi^2_k(I, D) + \sum_k \beta_k R_k(I) $$
where $\chi^2$ are data products related to the usual $\chi_k^2$ and $R_k$ are image regularizers. This is different from Comrade
in a few ways. First, the $\alpha$ are usually tuned to some value which makes it difficult to interpret as a usual data likelihood. The $\beta$ parameters are then tuned as well. This get quite detailed and is a bit of an art so I don't think it will fit into the paper. I can include a sentence or two explaining the difference and then cite some other papers for more information.
You mention in the README the want for speed and proper SIMD, GPU, and distributed support.
That part of the README was pretty old so I removed it. GPU support is something planned but it will require a bit of work so I don't want to promise anything yet.
I can try to put some benchmarks compared to eht-imaging modeling pacakge later this week. I have another larger and flashier benchmark to an EHT package called Themis, but it is difficult since
@mileslucas alright here are some preliminary benchmarks. The benchmark evaluates the model I presented in the paper using both Comrade and eht-imaging
Comrade
julia> @benchmark visibilities($m, $u, $v)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 91.464 μs … 584.311 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 98.503 μs ┊ GC (median): 0.00%
Time (mean ± σ): 108.179 μs ± 28.756 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█▂▁▇█▂ ▂▅ ▃ ▁ ▁▁▁▁▁▁▁ ▁▂ ▂
█████████▇▇▆██▅▆▄▇█▄▄▂▃▃█▄▂▂▃▃▃▄▆▄▄▂▄▃▄▃▃▄▆▆▆▇▇██████████████ █
91.5 μs Histogram: log(frequency) by time 198 μs <
Memory estimate: 4.50 KiB, allocs estimate: 2.
eht-imaging
julia> @benchmark meh.sample_uv($u, $v)
BenchmarkTools.Trial: 4445 samples with 1 evaluation.
Range (min … max): 898.894 μs … 5.173 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.034 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.118 ms ± 240.445 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▄▁▁█▇▄▃▇▆▂▁▁▃▁ ▁▁▁▁▁▁ ▁
▅▃████████████████▆▆▇▆▇▅▃▆▅▆▁▁▄▄▄▄▃▃▅▄▁▃▃▄▁▃▁▅▄▃▅▇▆▇▇▇▇██████ █
899 μs Histogram: log(frequency) by time 1.98 ms <
Memory estimate: 6.22 KiB, allocs estimate: 39.
So that is an order of magnitude speed up! I am actually a bit surprised that it is so much since the major computational cost should just be bessel functions, but hey I guess compiling does matter!
Here are some more benchmarks. I looked at the 2017 M87 EHT data and looked at likelihood evaluation and gradient times:
Comrade (micro sec) | eht-imaging (micro sec) | Themis (micro sec) | |
---|---|---|---|
likelihood (min) | 31 | 445 | 55 |
likelihood (mean) | 36 | 476 | 60 |
grad likelihood (min) | 105 (ForwardDiff) | 1898 | 1809 |
grad likelihood (mean) | 119 (ForwardDiff) | 1971 | 1866 |
This is actually really really good. I am specifically pleased with the Themis comparison since it is written in C++. This is probably because of the different bessel function implementation since Themis uses a handwritten one, while Comrade uses the one from SpecialFunctions.jl. This is actually by design since Themis was written to have as few dependencies as possible so most of the code was rewritten from scratch (I wrote a bunch of it and tbh it isn't great). On the other hand Comrade tries to use the Julia ecosystem as much as possible.
The gradient time are also a little surprising since eht-imaging uses handwritten gradients for everything so that fact that ForwardDiff.jl is doing so well is encouraging. For Themis it's poor benchmark was expected since it doesn't compute gradients using autodiff or handwritten but instead relies on a finite difference scheme which is pretty damn slow.
@ptiede those are some nice benchmark results! I think you could just paste those outputs into your docs (maybe Home/Benchmarks or maybe a separate page) with a snippet of some of your versioninfo()
output to see machine CPU and Julia build. Bonus points if you put the benchmark code in the repo and provide a little README for how to replicate, but I don't see that as necessary.
With the contribution guidelines checked off, you've got all the pieces done for the JOSS paper. Once you merge in those PRs I'll close out my review with JOSS.
I added the benchmarks to the docs and included some code to reproduce the results for eht-imaging and Comrade. Everything is merged and passing tests! Thanks for all your hard work! The package is much much better now.
Awesome, it's looking great! Thanks for contributing to the Julia astronomy ecosystem, I'm always happy to see the community grow :) cheers
necro-linking the JOSS issue https://github.com/openjournals/joss-reviews/issues/4457
Hey @ptiede, happy to be reviewing this JOSS submission. I'll be using this issue for working discussions and problems I find!
I'll start with doc typos I've found; this is what I've caught while reading through-
this sentence is missing a verb, I think? https://github.com/ptiede/Comrade.jl/blob/79b66f8e1d06ef933e2148cbc8e26fab8e503bc3/docs/src/vlbi_imaging_problem.md?plain=1#L3
Loading Data into Comrade
dead link, should point to
achael/eht-imaging
https://github.com/ptiede/Comrade.jl/blob/79b66f8e1d06ef933e2148cbc8e26fab8e503bc3/docs/src/examples/data.jl#L17During this part of the first tutorial
I got some warnings
that seem like they can be addressed.
Making Image of a Black Hole
Very minor comment: for this tutorial, I think it would be clearer if you put the package imports where they are used- for one, if the user isn't interested in, say, the AHMC part of the tutorial, then they won't need to install/load the package. Second, it makes it super obvious that "I need this to enable these things", like you don't need
Distributions.jl
until you create the priors, and you don't needGalacticOptim.jl
andComradeGalactic.jl
until you find the MAP estimate.(Tangential note: There's something flagrantly un-Julian about having to make 5 unique packages for 5 unique optimization algorithms when the language is build on composability. I don't have a solution, and this is fairly commonplace, but it's obviously a lot of extra work for you as a maintainer and is annoying at minimum as a user)
Problems:
not clear what "# from EHTC VI 2019" is supposed to mean, and extraneous period.
Modeling with non-analytic Fourier transforms
seems to be invalid syntax, should be
seems like this part of the tutorial is out of date.
Taking a break for now, will keep adding as I continue reviewing the code!