timothyas / xesn

Large scale Echo State Networks powered by xarray and dask.
https://xesn.readthedocs.io
Apache License 2.0
1 stars 1 forks source link

[joss] Editorial paper proofreading requests :) #78

Closed sneakers-the-rat closed 3 weeks ago

sneakers-the-rat commented 4 weeks ago

Part of: https://github.com/openjournals/joss-reviews/issues/7286

Some very small proofreading changes from reading the paper. Overall it's great, just some suggestions that might improve the strength of the paper. The minor suggestions are up to you, take them or leave them, but I would like some clarification on the medium/general suggestions/questions for the purposes of making the paper as strong as it can be :)

Minor/grammatical suggestions:

Medium questions/suggestions:

General suggestions:

timothyas commented 4 weeks ago

Thanks for these @sneakers-the-rat!

Minor suggestions

I agreed with all of the changes you suggested, and made those in the text. These suggestions definitely cleared up a few things, thanks! Just to answer a few questions (even though I think you got the gist of what was going on):

L66-68: list is a bit awkward, maybe something like "map them across available resources, from a laptop to a distributed HPC cluster." - if i'm reading your purpose there correctly re: emphasizing how dask lets you scale from small compute to big compute?

That's correct. I just added "cloud" to your suggestion in order to indicate that it can work in the cloud too.

L79: unclear to me if this is just showing another similar package, or if this is the one that you build xesn on top of as mentioned in L62?

Yeah it was just to list another implementation, so I went with your wording suggestion. Thanks!

Medium questions/suggestions:

Screenshot 2024-10-30 at 11 50 22 AM

General suggestions

timothyas commented 4 weeks ago

FYI @sneakers-the-rat the changes I made to the text can be seen in #80. The paper draft workflow is there, so a pdf can be found, e.g. here

sneakers-the-rat commented 4 weeks ago

They are just there to provide the user some guidance on how much memory they might need.

ah! yes that clears things up :) if they're just to give a sense of what parameters influence memory usage and to give me an estimate, that's very helpful! I might even suggest moving that to the main text of the paper or even to the docs - it's rare to get something like that for a package, for it to let me know this accurately how many resources it requires!

I changed this figure so that the measured values are just dots (CPU) or diamonds (GPU) and the derived lines are solid lines that lay on top.

Getting there! I truly don't mean to be a stickler on this, and you can feel free to disagree, but now the different plotting styles of left and right panel make it harder to parse for me - dot shapes are in general less salient markers of difference than color or line style, but the difference between GPU and CPU is arguably the most important contrast. Asking the question to myself "what is it that the gray/black estimate lines are doing for the figure," and I think they don't add a lot to the figure per se as much as I think they would be helpful as a heuristic for potential users. Their semantic content in the figure I think is to demonstrate that the empirical heuristic formula for memory use matches the measured samples, but since the parameters are empirical (rather than derived from something like {dtype_scalar_size}*{array_shape}*... where showing close match between theory and measurement is informative, I think they might serve you better outside of the figure entirely. If you take them of the figure and move them to the main text, then I think you might simplify the figure, caption, and also give a clearer indication to the reader that this formula might be useful to them rather than them thinking of it as a result that they are to read but not necessarily use :) (it is a result, but hopefully my meaning is clear).

I modified the dash frequency

the legend on the left is definitely clearer now! thanks! I am neutral on the dot shapes, up to you whether you like them or not :).

It's a fairly commonly used term, but let me know if you'd like more description there

If this is a term of art that you would expect your readers to know, that's good enough for me. (I'm also one of these newfangled anti-ivory-tower types who thinks it's not only fine but a favor to your reader to shortcut them to wikipedia articles for basic concepts they might need a refresher on, but that's definitely not for everyone so i only mention it as an aside, the added reference is great and i'm sure will be a meaningful indicator of what you mean to your readership!)

The problem size is most definitely fixed. ... "more or less" was referring to the fact that the way dask interacts with the problem can be less intuitive

I know this tension well. I think taking it out lets me see through to what you were saying more easily without getting distracted.

Unfortunately I don't have enough GCP allocation right now to afford additional tests, even though you're right, this would show off the code more.

that is totally fine. So fine that if "ran out of compute credits" was in a methods section as a limitation i wouldn't bat an eye. As a curiosity question, have you profiled to see if my hunch was right? that once the code starts running it finishes extremely quickly, and most of that time is spent in cuda compilation? or is there some other reason it's flat?


rest of the text looks great!

timothyas commented 4 weeks ago

Hi @sneakers-the-rat thanks for all of the comments and feedback! Here's my reply for the last remaining questions/issues.

Screenshot 2024-10-31 at 11 59 38 AM

If this is a term of art that you would expect your readers to know, that's good enough for me.

As a curiosity question, have you profiled to see if my hunch was right? that once the code starts running it finishes extremely quickly, and most of that time is spent in cuda compilation? or is there some other reason it's flat?

sneakers-the-rat commented 3 weeks ago

I think that having the black and gray lines in the plot gives the user an intuitive understanding for how well they can expect the equations to map to their application.

Fair enough! It is your figure :). I like the swap of the colors/shapes, that makes more sense to me. Also putting the dots on top of the lines helps a lot.

What about using solid/dashed lines instead of black/gray? They're visually distinct enough that it wouldnt get confused with the solid/dotted on the left. Black/gray is just a bit hard to tell apart.

Also since the gray lines follow the CPU, and the black follows GPU, what do you think about replacing the legend labels for those lines with some readable shorthand for the equation like "calculated CPU memory" (or whatever you think is accurate, just an example) and then putting the equation in the caption? The equation in the legend already needs someone to read the caption to know what the terms are, so might as well give the eqs a name so people get a semantic handle for them?

Edit: one more thing - it's obvious if you think about it, but it may be worth saying in the caption that the dots/triangles overlap on the right plot but there are still two curves there. On the left the dashed line makes it clear there are two GPU samples, but on the right it just looks like one when I am zoomed out to full paper view. Up to you, I think it could be inferred from the legend, im just a fan of captions that anticipate visual confusion.

I could make the two plots similar by deriving comparable curves for walltime, but I expect this to be less machine independent than memory.

Agreed. This would probably be too variable to be useful.

I went in and added a more intuitive description, and please note that my previous edit included a citation to Amdahl's law

This looks great. I saw and went and did a bit of reading yesterday, the power of a good reference ❤️.

timothyas commented 3 weeks ago

Hi @sneakers-the-rat, thank you for all of these comments!

What about using solid/dashed lines instead of black/gray? They're visually distinct enough that it wouldnt get confused with the solid/dotted on the left. Black/gray is just a bit hard to tell apart.

I appreciate the suggestion, but I will keep the lines solid, since the difference between dashed and solid lines indicates the difference between the two system sizes.

Also since the gray lines follow the CPU, and the black follows GPU, what do you think about replacing the legend labels for those lines with some readable shorthand for the equation like "calculated CPU memory" (or whatever you think is accurate, just an example) and then putting the equation in the caption?

Thanks for the suggestion! I would prefer to keep it how it is, with the equation clearly stated in the legend. I did however modify the following phrase in the caption to be more specific about which line goes to CPU and which to GPU:

"The gray and black lines indicate the general trend in memory usage for the CPU and GPU simulations, respectively.

Also, I added subtitles to the legend so that it's clear which line goes with CPUs and which goes with GPUs.

it may be worth saying in the caption that the dots/triangles overlap on the right plot but there are still two curves there

Thanks, I added a white markeredgeborder to the plot to make this more apparent.


Here is the latest figure and caption. I hope we have converged, but let me know if you have additional comments.

Screenshot 2024-11-01 at 9 37 03 AM

sneakers-the-rat commented 3 weeks ago

great, looks good to me :)

timothyas commented 3 weeks ago

Thanks @sneakers-the-rat!