Infineon / StreamGen

Python framework for generating streams of labeled data.
https://infineon.github.io/StreamGen/
MIT License

Paper comments #6

Open firefly-cpp opened 1 week ago

firefly-cpp commented 1 week ago

Hi @LaurenzBeck,

I am posting some relevant comments you may consider when revising your software paper, which is currently under review in JOSS.

According to https://github.com/pytorch/vision, the author is listed as "TorchVision maintainers and contributors." You should also make this consistent in your paper.

You may also devote some sentences to the theoretical background of continual learning at the beginning, when you start introducing terms. Also, add a couple of significant references, review papers, etc.

Define acronyms on the site when they appear for the first time.

You may also need to include a more detailed description of the proposed software architecture. This aspect should be considered since it also helps future users quickly learn, modify, and adopt the software. Also, why not present one Python snippet in your paper? See one example of a paper here: https://joss.theoj.org/papers/10.21105/joss.06748.pdf

You should also explicitly mention similar (or competing) frameworks that are close to your proposed framework, along with their pros and cons (one table?).

This issue is a part of the review process: https://github.com/openjournals/joss-reviews/issues/7206

LaurenzBeck commented 1 week ago

morning Iztok ☕👋

Thank you for also checking the paper contents so thoroughly 🙏

According to https://github.com/pytorch/vision, the author is listed as "TorchVision maintainers and contributors." You should also make this consistent in your paper.

I corrected the bib entry of torchvision, Zotero messed this one up...

You may also devote some sentences to the theoretical background of continual learning at the beginning, when you start introducing terms. Also, add a couple of significant references, review papers, etc.

This one was tricky for me, since Continual Learning research was my main motivation for creating the package, but it is really just one application of StreamGen. I feel like I already spend a lot of paper space (briefly) introducing it in the Statement of Need section (lines 44 - 78).

You may also need to include a more detailed description of the proposed software architecture.

I quickly went over https://joss.readthedocs.io/en/latest/paper.html#what-should-my-paper-contain to make sure this is not a strict requirement. It would be a nice section in a full-length software paper, but with the paper already 200 words above the upper recommended bound, I feel like this would stretch this JOSS paper too much. I explained the architectural choices in detail in the user guide of the documentation.

Also, why not present one Python snippet in your paper?

Same reason here: the documentation contains many examples, and I wanted to stay within the word count bounds. Unfortunately, a representative and educational code example would probably easily fill 1 - 2 pages :/

You should also explicitly mention similar (or competing) frameworks that are close to your proposed framework, along with their pros and cons (one table?).

I do mention other similar endeavours in the statement of need section:

I also mention how StreamGen extends these ideas (lines 75 - 77).

Given that there are no direct alternatives with concrete criteria to compare against and distinguish StreamGen from, I would favour the free text format over a table.


I will leave this issue open for you to respond and for @hoanganhngo610 to also use it for his comments ;)

best wishes, Laurenz

hoanganhngo610 commented 1 week ago

Thank you so much @firefly-cpp for your first comments on the paper. @LaurenzBeck, my first comment would also be regarding the references, since there are several papers in your bibliography cited as preprints from arXiv, while in fact they have since been published elsewhere. These include:

LaurenzBeck commented 1 week ago

Thank you a lot @hoanganhngo610 for checking the citations 🙏

Did you check them manually? Can you recommend a workflow/tool for doing it more efficiently? (I mostly have pre-print references in my database 😅) I quickly checked, and Zotero is not able to do it automatically.

Anyways, I updated the 4 references you mentioned.

hoanganhngo610 commented 4 days ago

@LaurenzBeck Thank you so much for your response, and for the prompt action to modify the references. In fact, I have to check all of them manually, and I don't really think there is any workflow/tool that can do it efficiently. Whenever I write papers, I have to double-check that the cited preprints have not since appeared somewhere else.

I will continue my review of the paper and/or the discussion of @firefly-cpp's review in the next comments.

hoanganhngo610 commented 4 days ago

My first follow-up comment based on the suggestion from @firefly-cpp would be:

This one was tricky for me, since Continual Learning research was my main motivation for creating the package, but it is really just one application of StreamGen. I feel like I already spend a lot of paper space (briefly) introducing it in the Statement of Need section (lines 44 - 78).

The main focus of StreamGen is to provide a framework for generating data streams, and it has multiple applications apart from Continual Learning (CL) itself. However, the author could add 1-2 more sentences about the applications and constraints of CL within the first paragraph (lines 44-47): for example, each sample is only used once, the number of tasks to be learned is not pre-defined, and the accommodation of new information should occur without catastrophic forgetting and interference. Only mentioning that CL "... extends the paradigm to work in dynamic (...) environments ..." and "... acquire and preserve knowledge continually" does not seem to be enough.

LaurenzBeck commented 3 days ago

I guess I am already too deep into the topic and assuming too much. Thank you for the clarification, I will add and rewrite some parts today 😊

LaurenzBeck commented 3 days ago

Done: https://github.com/Infineon/StreamGen/actions/runs/11379909229

I iterated on some explanations and managed to keep the current page layout 😁