openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
714 stars 38 forks source link

[REVIEW]: NEMSEER: A Python package for downloading and handling historical National Electricity Market forecast data produced by the Australian Energy Market Operator #5883

Closed editorialbot closed 10 months ago

editorialbot commented 1 year ago

Submitting author: !--author-handle-->@prakaa<!--end-author-handle-- (Abhijith Prakash) Repository: https://github.com/UNSW-CEEM/NEMSEER Branch with paper.md (empty if default branch): joss-paper Version: v1.0.7 Editor: !--editor-->@arfon<!--end-editor-- Reviewers: @mfleschutz, @amandadsmith Archive: 10.5281/zenodo.10162614

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/5ba29c4a5cfec8721c876587699ab719"><img src="https://joss.theoj.org/papers/5ba29c4a5cfec8721c876587699ab719/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/5ba29c4a5cfec8721c876587699ab719/status.svg)](https://joss.theoj.org/papers/5ba29c4a5cfec8721c876587699ab719)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@mfleschutz & @amandadsmith, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @arfon know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

πŸ“ Checklist for @mfleschutz

πŸ“ Checklist for @amandadsmith

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 year ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.37 s (174.1 files/s, 27668.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          17            495            856           2945
Jupyter Notebook                 3              0           1857           1310
Markdown                        17            377              0           1006
HTML                            19              0              0            722
YAML                             4             30             53            178
TeX                              1             18              0            163
TOML                             1             14             20             72
DOS Batch                        1              8              1             26
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                            64            946           2794           6431
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 1 year ago

Wordcount for paper.md is 968

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10/gcj9vc is OK
- 10.5334/jors.148 is OK
- 10.5281/zenodo.7697295 is OK
- 10.5281/zenodo.7467074 is OK
- 10.1016/j.joule.2022.01.004 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.5281/zenodo.3509134 is OK

MISSING DOIs

- 10.4337/9781788979955.00017 may be a valid DOI for title: The Evolution of the European Model for Electricity Markets

INVALID DOIs

- None
arfon commented 1 year ago

@mfleschutz, @amandadsmith This is the review thread for the paper. All of our communications will happen here from now on.

Please read the "Reviewer instructions & questions" in the first comment above. Please create your checklist typing:

@editorialbot generate my checklist

As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/5883 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for the review process to be completed within about 4-6 weeks but please make a start well ahead of this as JOSS reviews are by their nature iterative and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mfleschutz commented 1 year ago

Review checklist for @mfleschutz

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

amandadsmith commented 1 year ago

Review checklist for @amandadsmith

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

mfleschutz commented 11 months ago

So I am starting my proper review.

First thing is noticed when reading through the documentation:

In https://nemseer.readthedocs.io/en/latest/examples/visualising_p5min_demand_forecasts.html there are a lot of annoying warnings:

WARNING: findfont: Generic family 'sans-serif' not found because none of the following families were found: Source Sans 3, Helvetica, Computer Modern Sans Serif, Avant Garde, sans-serif

@prakaa, can you get rid of them?

mfleschutz commented 11 months ago
mfleschutz commented 11 months ago
mfleschutz commented 11 months ago

I use a MacBook Pro M1 to test. When running the p5min_demand_forecast_error_2021.ipynb the download_raw_data works. However, calculating the forecast_error (in the cell containing the line forecast_error = pd.concat(results, axis=0)) throws the following error multiple times, so I stopped the calculation:

multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File ".../python3.9/multiprocessing/pool.py", line 114, in worker
    task = get()
  File ".../python3.9/multiprocessing/queues.py", line 368, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'calculate_p5min_demand_forecast_error_simpler' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File ".../python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File ".../python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
prakaa commented 11 months ago

@editorialbot generate pdf

editorialbot commented 11 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

prakaa commented 11 months ago

Hi @mfleschutz,

Thanks very much for taking the time to review the package and the paper. I've made changes based on your comments:


So I am starting my proper review.

First thing is noticed when reading through the documentation:

In https://nemseer.readthedocs.io/en/latest/examples/visualising_p5min_demand_forecasts.html there are a lot of annoying warnings:

WARNING: findfont: Generic family 'sans-serif' not found because none of the following families were found: Source Sans 3, Helvetica, Computer Modern Sans Serif, Avant Garde, sans-serif

@prakaa, can you get rid of them?

These should no longer be in that example. Sorry about that.


  • [ ] @prakaa, some of the References in the JOSS paper are no books but have no URL or DOI.

Proof (as in comment above) should now have URLs/DOIs for all references


  • [ ] In the paper, you do not mention, if there is a similar package out there. This would help to stress the statement of need.

    • [ ] If there are similar packages, please describe how this software compares to other commonly-used packages(State of the field).

I have added to the paragraph that follows how NEMSEER solves the issues with accessing AEMO forecast data. That paragraph now reads:

Though existing software solutions (e.g. NemSight, ez2view and NEOpoint) can provide access to some of the same data, most lack a programmatic interface useful for deeper analysis and all are proprietary commercial software. Furthermore, NEMSEER adds significant value to users interested in deeper analysis through its documentation. It contains examples showing how users can analyse demand forecast errors and energy price convergence using pre-dispatch demand and price forecast data (obtained using NEMSEER) and historical actual NEM system and market data (obtained using NEMOSIS) (Gorman et al., 2018). Figure 1 is an output of one such example.


I use a MacBook Pro M1 to test. When running the p5min_demand_forecast_error_2021.ipynb the download_raw_data works. However, calculating the forecast_error (in the cell containing the line forecast_error = pd.concat(results, axis=0)) throws the following error multiple times, so I stopped the calculation:

multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File ".../python3.9/multiprocessing/pool.py", line 114, in worker
    task = get()
  File ".../python3.9/multiprocessing/queues.py", line 368, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'calculate_p5min_demand_forecast_error_simpler' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File ".../python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File ".../python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)

I had originally provided two methods for calculating demand forecast errors. This error is related to the method that uses xarray, pandas and multiprocessing. However, this is a method that many users are unlikely to use to calculate errors across a year (it takes many hours to complete). So I've removed that method, and now only have one method. I have tested this and it works for me. Could you please run the new Jupyter notebook? Note that it write HTML plots to file

arfon commented 11 months ago

:wave: @amandadsmith – how are you getting on with your review?

amandadsmith commented 11 months ago

@arfon Thank you for the reminder. So sorry to the authors! I dropped the ball on this one but will work on it in the next 2 days

amandadsmith commented 11 months ago

Regarding Reproducibility: The review checklist asks that "If the paper contains original results, results are entirely reproducible by reviewers." It looks like NEMOSIS would be required to reproduce Fig. 1 in the paper. I'm not interested in downloading another package and reproducing it, but not sure whether it's appropriate to check this one off.

EDITED to add: It seems that the examples in readthedocs also rely on NEMOSIS. If it's expected that the two are always used together, I would mention this in the installation instructions or put a link to NEMOSIS readme on the NEMSEER readme. (If not, can you provide examples demonstrating NEMSEER's functionality alone?)

amandadsmith commented 11 months ago

Regarding Community guidelines: I'm not sure if there is any guidance provided for "third parties wishing to ... 3) Seek support"

amandadsmith commented 11 months ago

Regarding References: I think the names got mixed up in this one: Hoyer, S., & Joseph, H. (2017). Xarray: N-D labeled arrays and datasets in python. Journal of Open Research Software, 5(1). https://doi.org/10.5334/jors.148

amandadsmith commented 11 months ago

Regarding documentation: The quick start guide says that "AEMO’s MMS Data Model reports describe tables and columns that are available via nemseer." However, the link doesn't directly show information related to the tables--each package in the left sidebar seems to have a list of tables, but it's a lot to browse. If I wanted to know more about e.g. 'CASESOLUTION' would I need to know the package that contains it?

amandadsmith commented 11 months ago

Also might be nice to explain here that the time period is considered as month ending, and will download a file for the entire month corresponding to a given time stamp.

e.g. I queried: nemseer.download_raw_data(forecast_type="STPASA", tables="CASESOLUTION", raw_cache = raw_cache, forecasted_start="2010/01/01 00:00", forecasted_end="2010/01/31 11:30", keep_csv=True)

...and I expected to get one parquet file, one csv file, but got two of each and it just took me a minute to realize why.

amandadsmith commented 11 months ago

@prakaa @arfon I've done my review. Seems like this package could save other researchers a lot of time! My remaining items are minor--please see comments above and notify when resolved, and I will get back to the checklist ASAP. Thanks for your patience and sorry again for the delay on my end.

mfleschutz commented 11 months ago

I also finished my review. NEMSEER is a useful and well-documented piece of software with a clear contribution. @prakaa thanks for your work. @arfon thanks for overseeing this review process.

arfon commented 11 months ago

Thanks @amandadsmith and @mfleschutz. Seems like there are a few (fairly minor) areas of feedback to address here @prakaa – please let me know when you've managed to incorporate the reviewer feedback.

prakaa commented 10 months ago

@editorialbot generate pdf

editorialbot commented 10 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

prakaa commented 10 months ago

Addressing @amandadsmith's comments here (thanks very much for your time on this review and for your comments!). If this is fine with @amandadsmith, I have incorporated all of the reviewers' feedback @arfon.


Regarding Reproducibility: The review checklist asks that "If the paper contains original results, results are entirely reproducible by reviewers." It looks like NEMOSIS would be required to reproduce Fig. 1 in the paper. I'm not interested in downloading another package and reproducing it, but not sure whether it's appropriate to check this one off.

EDITED to add: It seems that the examples in readthedocs also rely on NEMOSIS. If it's expected that the two are always used together, I would mention this in the installation instructions or put a link to NEMOSIS readme on the NEMSEER readme. (If not, can you provide examples demonstrating NEMSEER's functionality alone?)

I have added a note in the Installation section of the README that explains how to install NEMOSIS


Regarding Community guidelines: I'm not sure if there is any guidance provided for "third parties wishing to ... 3) Seek support"

There is now a Support section in the README


Regarding References: I think the names got mixed up in this one: Hoyer, S., & Joseph, H. (2017). Xarray: N-D labeled arrays and datasets in python. Journal of Open Research Software, 5(1). https://doi.org/10.5334/jors.148

The names should now be correct in the proof above. Thanks for spotting this


Regarding documentation: The quick start guide says that "AEMO’s MMS Data Model reports describe tables and columns that are available via nemseer." However, the link doesn't directly show information related to the tables--each package in the left sidebar seems to have a list of tables, but it's a lot to browse. If I wanted to know more about e.g. 'CASESOLUTION' would I need to know the package that contains it?

Yes you would and I've provided some more guidance on how to locate the table of interest in a new subsection of the quick start guide


Also might be nice to explain here that the time period is considered as month ending, and will download a file for the entire month corresponding to a given time stamp.

e.g. I queried: nemseer.download_raw_data(forecast_type="STPASA", tables="CASESOLUTION", raw_cache = raw_cache, forecasted_start="2010/01/01 00:00", forecasted_end="2010/01/31 11:30", keep_csv=True)

...and I expected to get one parquet file, one csv file, but got two of each and it just took me a minute to realize why.

I'm not quite sure if this is what you meant, but this is because the function call above calculates a run_start and a run_end that are in different months. I sort of explained this in the glossary, but I agree with you that it's not clear from anything in the quick start guide. I've added a footnote to two places in the quick start guide (one is the section you reference) that explains this behaviour


amandadsmith commented 10 months ago

@prakaa Everything I've mentioned is well addressed here. Thank you for clarifying each item and making it so easy to follow up. Congratulations on making such a nice contribution!

@arfon My checklist is complete and all minor items mentioned have been satisfied.

arfon commented 10 months ago

@prakaa – looks like we're very close to being done here. I will circle back here next week, but in the meantime, please give your own paper a final read to check for any potential typos etc.

After that, could you make a new release of this software that includes the changes that have resulted from this review. Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive? For the Zenodo/figshare archive, please make sure that:

arfon commented 10 months ago

@editorialbot recommend-accept

editorialbot commented 10 months ago

Paper is not ready for acceptance yet, the archive is missing

arfon commented 10 months ago

@editorialbot generate pdf

editorialbot commented 10 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

prakaa commented 10 months ago

@prakaa – looks like we're very close to being done here. I will circle back here next week, but in the meantime, please give your own paper a final read to check for any potential typos etc.

After that, could you make a new release of this software that includes the changes that have resulted from this review. Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive? For the Zenodo/figshare archive, please make sure that:

* The title of the archive is the same as the JOSS paper title

* That the authors of the archive are the same as the JOSS paper authors

* I can then move forward with accepting the submission.

Thanks for co-ordinating the review @arfon. DOI is 10.5281/zenodo.10162614

prakaa commented 10 months ago

@arfon is the DOI above OK? Please let me know if I need to do anything else.

arfon commented 10 months ago

@editorialbot set 10.5281/zenodo.10162614 as archive

editorialbot commented 10 months ago

Done! archive is now 10.5281/zenodo.10162614

arfon commented 10 months ago

@arfon is the DOI above OK? Please let me know if I need to do anything else.

Yeah, all good. Thanks for the ping. I'm afraid I got a little behind on my JOSS editorial duties πŸ˜…

arfon commented 10 months ago

@editorialbot set v1.0.7 as archive

editorialbot commented 10 months ago

Done! archive is now v1.0.7

arfon commented 10 months ago

@editorialbot recommend-accept

editorialbot commented 10 months ago
Attempting dry run of processing paper acceptance...
editorialbot commented 10 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1093/oxrep/grx041 is OK
- 10.5334/jors.148 is OK
- 10.5281/zenodo.7697295 is OK
- 10.5281/zenodo.7467074 is OK
- 10.1016/j.joule.2022.01.004 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.5281/zenodo.3509134 is OK
- 10.4337/9781788979955.00017 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 10 months ago

:wave: @openjournals/pe-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/4823, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

arfon commented 10 months ago

@editorialbot accept

editorialbot commented 10 months ago
Doing it live! Attempting automated processing of paper acceptance...
editorialbot commented 10 months ago

Ensure proper citation by uploading a plain text CITATION.cff file to the default branch of your repository.

If using GitHub, a Cite this repository menu will appear in the About section, containing both APA and BibTeX formats. When exported to Zotero using a browser plugin, Zotero will automatically create an entry using the information contained in the .cff file.

You can copy the contents for your CITATION.cff file here:

CITATION.cff

``` cff-version: "1.2.0" authors: - family-names: Prakash given-names: Abhijith orcid: "https://orcid.org/0000-0002-2945-4757" - family-names: Bruce given-names: Anna orcid: "https://orcid.org/0000-0003-1820-4039" - family-names: MacGill given-names: Iain orcid: "https://orcid.org/0000-0002-9587-6835" contact: - family-names: Prakash given-names: Abhijith orcid: "https://orcid.org/0000-0002-2945-4757" doi: v1.0.7 message: If you use this software, please cite our article in the Journal of Open Source Software. preferred-citation: authors: - family-names: Prakash given-names: Abhijith orcid: "https://orcid.org/0000-0002-2945-4757" - family-names: Bruce given-names: Anna orcid: "https://orcid.org/0000-0003-1820-4039" - family-names: MacGill given-names: Iain orcid: "https://orcid.org/0000-0002-9587-6835" date-published: 2023-12-09 doi: 10.21105/joss.05883 issn: 2475-9066 issue: 92 journal: Journal of Open Source Software publisher: name: Open Journals start: 5883 title: "NEMSEER: A Python package for downloading and handling historical National Electricity Market forecast data produced by the Australian Energy Market Operator" type: article url: "https://joss.theoj.org/papers/10.21105/joss.05883" volume: 8 title: "NEMSEER: A Python package for downloading and handling historical National Electricity Market forecast data produced by the Australian Energy Market Operator" ```

If the repository is not hosted on GitHub, a .cff file can still be uploaded to set your preferred citation. Users will be able to manually copy and paste the citation.

Find more information on .cff files here and here.

editorialbot commented 10 months ago

🐘🐘🐘 πŸ‘‰ Toot for this paper πŸ‘ˆ 🐘🐘🐘

editorialbot commented 10 months ago

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/4824
  2. Wait five minutes, then verify that the paper DOI resolves https://doi.org/10.21105/joss.05883
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! πŸŽ‰πŸŒˆπŸ¦„πŸ’ƒπŸ‘»πŸ€˜

Any issues? Notify your editorial technical team...