openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
721 stars 38 forks source link

[REVIEW]: h5preserve #581

Closed whedon closed 6 years ago

whedon commented 6 years ago

Submitting author: @aragilar (James Tocknell) Repository: https://github.com/h5preserve/h5preserve Version: v0.15.0 Editor: @arfon Reviewer: @mattpitkin Archive: 10.5281/zenodo.1179292

Status

status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea"><img src="http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea/status.svg)](http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@mattpitkin, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @arfon know.

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 6 years ago

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @mattpitkin it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands
whedon commented 6 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 6 years ago

--> Check article proof :page_facing_up: <--

arfon commented 6 years ago

@mattpitkin - please carry out your review in this issue by updating the checklist above and giving feedback in this issue. The reviewer guidelines are available here: http://joss.theoj.org/about#reviewer_guidelines

Any questions/concerns please let me know.

mattpitkin commented 6 years ago

Hi @aragilar - could you include explicit instructions for installation in the README and documentation? E.g., something like:

install the software using:

pip install h5preserve

or by cloning this repository and running:

python setup.py install --user

Could you also add a requirements file?

mattpitkin commented 6 years ago

Having downloaded h5preserve using pip, I'm trying the example shown here, and in both Python 2.7.12 and 3.5.2 I get the following error:

In [6]: my_cool_experiment = Experiment(np.array([1,2,3,4,5]), 10)

In [7]: with open("my_data_file.hdf5", new_registry_list(registry)) as f:
   ...:     f["cool experiment"] = my_cool_experiment
   ...: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-f3958ebfc497> in <module>()
----> 1 with open("my_data_file.hdf5", new_registry_list(registry)) as f:
      2     f["cool experiment"] = my_cool_experiment

TypeError: file() argument 2 must be string, not RegistryContainer

If I edit it to use with open("my_data_file.hdf5", "w") as f: I get the following error:

In [10]: with open("my_data_file.hdf5", "w") as f:
    ...:     f["cool experiment"] = my_cool_experiment
    ...: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-63e87c258186> in <module>()
      1 with open("my_data_file.hdf5", "w") as f:
----> 2     f["cool experiment"] = my_cool_experiment

TypeError: 'file' object does not support item assignment

Do you know what the problem could be?

aragilar commented 6 years ago

@mattpitkin That error due to me not checking that the code runs, it should be h5open not open (I've updated the docs), that example also used an older version of the api, where dictionaries could be used (I dropped that as there were a bunch of problems with it).

aragilar commented 6 years ago

I've also added more explicit instructions and referred to https://packaging.python.org/ as needed (I was avoiding explicit instructions, as I've noticed far too many projects refer to old methods e.g. python setup.py install, eazy_install).

I haven't added a requirements.txt file for the dependencies of h5preserve as h5preserve is a library, not an application, so it shouldn't have a requirements.txt as per https://caremad.io/posts/2013/07/setup-vs-requirement/. If you meant for the tests, tox handles all that.

mattpitkin commented 6 years ago

Thanks for these additions. Now that you've removed explicit Python 2 support could you update the version on PyPI, so that the badge on the README page no longer lists Python 2.7?

mattpitkin commented 6 years ago

Regarding the automated testing, I'm not really familiar with tox, but what does it mean when the output says

================================ 94 passed, 43 xfailed in 0.78 seconds ================================

Is it ok/expected that 43 xfail?

mattpitkin commented 6 years ago

Just a minor typo thing - the links to the repo in the "Contributing" doc page point to github.org rather than github.com, so they don't work.

aragilar commented 6 years ago

Fixed the urls, and added a test to the CI to make sure they don't break, and uploaded a new release to PyPI (so everything should show python 3 only).

The xfail come from pytest, it means the test is expected to fail (and hence "passes"), if they passed then there'd be a problem (I think most of those xfailing are stubs for unit tests I haven't had time to implement yet, most of the code is covered by integration tests).

mattpitkin commented 6 years ago

I just have a couple more question/requests?

Is it generally the case that you would define the registry.loader and registry.dumper functions outside of the class they're referring to, or could/should you have them as methods of the class just called, e.g., dump and load? The latter seems to make more sense to me, but maybe I'm thinking of these methods in the wrong way.

Would it be possible to give explicit examples of creating your own RegistryContainer, and using the GroupContainer, as only the DatasetContainer is used in the current example? Also, would you be able to shown an example of versioning in action, where you create two different versions, as I'm not entirely sure what it means at the moment?

mattpitkin commented 6 years ago

A couple of minor typos to fix in the paper.md file:

mattpitkin commented 6 years ago

Can you add the doi = {10.1109/MCSE.2011.37} to the van_der_walt_numpy_2011 reference in the paper.bib file?

aragilar commented 6 years ago

@mattpitkin I've added the paper changes, and I've added more examples, do they make sense to you?

Whilst there's nothing stopping you from making the dumpers/loaders static methods on the class, there a few reasons not to:

  1. Sometimes it's hard to add methods to classes (or where it's inconvenient to): e.g. namedtuples, or other people's classes
  2. Subclasses will use the same method for loading/dumping, unless you override it: this is in effect the same problem that pickle has (it has options for controlling dumping via a bunch of methods: https://docs.python.org/3/library/pickle.html#pickle-inst), you lose control over what you dump/load (which is why you can choose not to use the built-in dumpers/loaders).
  3. Versioning becomes much harder: you either end up with n different functions, which is the same as the existing API, or you have a whole bunch of if statements, which you can screw up.

I'll try to translate that into an advice section, but once you try to implement the system as suggested in https://www.youtube.com/watch?v=7KnfGDajDQw, then you do come to the conclusion that separate functions just works better (though it took me two previous iterations of the code to learn this, it's definitely not obvious).

mattpitkin commented 6 years ago

@aragilar - thanks, the examples look good to me, so I'm happy to sign things off.

@arfon - I'm happy for this to be accepted.

arfon commented 6 years ago

@aragilar - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? I can then move forward with accepting the submission.

aragilar commented 6 years ago

Zenodo doi: https://doi.org/10.5281/zenodo.1179292 Zenodo record: https://zenodo.org/record/1179292

arfon commented 6 years ago

@whedon set 10.5281/zenodo.1179292 as archive

whedon commented 6 years ago

OK. 10.5281/zenodo.1179292 is the archive.

arfon commented 6 years ago

@mattpitkin many thanks for your review here ✨

@aragilar - your paper is now accepted into JOSS and your doi is https://doi.org/10.21105/joss.00581 :zap: :rocket: :boom:

whedon commented 6 years ago

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippet:

[![DOI](http://joss.theoj.org/papers/10.21105/joss.00581/status.svg)](https://doi.org/10.21105/joss.00581)

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the the following: