openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
703 stars 36 forks source link

[REVIEW]: CategoricalTimeSeries.jl: A toolbox for categorical time-series analysis #3733

Closed whedon closed 2 years ago

whedon commented 2 years ago

Submitting author: @johncwok (Corentin Nelias) Repository: https://github.com/johncwok/CategoricalTimeSeries.jl Version: v1.1.4 Editor: @jbytecode Reviewer: @bkamins, @felixcremer Archive: 10.6084/m9.figshare.16917625

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/919a3be5c11e2add7274a8517066361e"><img src="https://joss.theoj.org/papers/919a3be5c11e2add7274a8517066361e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/919a3be5c11e2add7274a8517066361e/status.svg)](https://joss.theoj.org/papers/919a3be5c11e2add7274a8517066361e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@bkamins & @felixcremer, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jbytecode know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @bkamins

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @felixcremer

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 2 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @bkamins, @felixcremer it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 2 years ago

Wordcount for paper.md is 784

whedon commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1137/141000671 is OK
- 10.1093/biomet/80.3.611 is OK
- 10.1002/9781119097013 is OK
- 10.1145/369133.369172 is OK
- 10.1162/neco_a_00961 is OK

MISSING DOIs

- None

INVALID DOIs

- None
whedon commented 2 years ago
Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.10 s (381.9 files/s, 99031.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              1              0              0           2671
JavaScript                       6            473           1213           2025
Julia                            8            183             17           1309
HTML                             7            272             18            950
Markdown                         8            134              0            513
CSS                              2             16             67             71
TeX                              1              8              0             70
YAML                             3             11              2             53
TOML                             1              5              0             20
XML                              1              0              0             11
JSON                             1              0              0              1
-------------------------------------------------------------------------------
SUM:                            39           1102           1317           7694
-------------------------------------------------------------------------------

Statistical information for the repository '194e50e4d4a79db82c104c3e' was
gathered on 2021/09/16.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
Corentin Nelias                  1          3711              0          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
Corentin Nelias            3711          100.0          0.0               32.69
whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jbytecode commented 2 years ago

Dear @bkamins and @felixcremer 👋👋👋

This is the review issue. There are 20 check items for each reviewer. Whenever you finish reviewing the corresponding item, you can set it checked. You can also interact with the author(s), other reviewer(s) and me (editor) during the reviewing process.

Please do not hesitate to ask me anything.

Thank you in advance.

jbytecode commented 2 years ago

Dear @bkamins and @felixcremer, do you need help? Could you please update your status on how is your reviewing going?

bkamins commented 2 years ago

Thank you for pinging. Now I can see that the package got a release 5 days ago that I have requested for earlier. I will go back to the review then.

bkamins commented 2 years ago

I am not fully sure where I should put the comments, so I put them here:

Regarding the analytical side of the package:

Thank you for working on this.

jbytecode commented 2 years ago

@bkamins it is okay to put your comments here in the thread, thank you.

CNelias commented 2 years ago

Dear @bkamins,

I addressed your comments, specifically:

I have one question\need help:

When I try to fix it, another package will throw this error. For example, if I fix it for FFTW, then Plots errors, and if I fix it for Plots then another one bugs and it is a never ending cycle. I search for a solution but couldn't fix this so I just removed all "problematic" compats in order to have a working package. Do you know what causes this and how I can solve it? There are already similar issues on the discourse but they are not resolved or do not solve my issue.

bkamins commented 2 years ago

then the package doesn't pass the test on the travis virtual machine, even though it works perfectly otherwise

The reason is the following. You seem not to have done the package releases properly. For instance IntegerIB.jl package has 0.1 tag only made yesterday. However, in Julia Global Registry, here https://github.com/JuliaRegistries/General/blob/master/I/IntegerIB/Versions.toml you see that 0.1 tag was added 13 months ago with git-tree-sha1 786b9878a5889e1feec38510e694016c3bb76c47.

This makes Julia Package manager install the IntegerIB.jl package from 13 months ago not from yesterday when you ask for it being installed (you can check it on a CLEAN environment - without any packages installed in a federated package repository - this is exactly what happens on Travis).

Note that Julia does not look at what is the tag in the GitHub repository for the release but it looks at git-tree-sha1 in https://github.com/JuliaRegistries/General to resolve package versions. You can read how new releases of packages should be registered in https://github.com/JuliaRegistries/Registrator.jl

CNelias commented 2 years ago

So re-registering all the packages to their current version should solve this right?

bkamins commented 2 years ago

No - you have to bump the version number and then register. The problem is exactly the fact that you use the same version number for different states of package GitHub repository (Julia will not allow registering the same version number twice as it would be ambiguous).

Also note that this has to be done sequentially. If A depends on B and B on C then you have to:

jbytecode commented 2 years ago

@bkamins just to inform you, you can send a pull request to the repo under review, if needed. all of the contributions are welcome.

bkamins commented 2 years ago

@jbytecode - it probably requires like 5 PRs in different repos and testing if all works correctly sequentially. The repo maintainer OTOH can probably do each operation in 1 minute (as maintainer has full rights to the repo and knows the packages and their interdependencies).

CNelias commented 2 years ago

@bkamins I took care of it, and it is passing CI now. I don't know how long it will take until the re-registration is complete though.

whedon commented 2 years ago

:wave: @bkamins, please update us on how your review is going (this is an automated reminder).

whedon commented 2 years ago

:wave: @felixcremer, please update us on how your review is going (this is an automated reminder).

bkamins commented 2 years ago

I don't know how long it will take until the re-registration is complete though.

I have checked your PRs to General Registry. As you can see your packages (for different reasons) fail the registration process. If you have any questions regarding the registration please ask me and I will try to help.

Otherwise - could you please let me know when the registration is finished - thank you! (in particular at least one of the packages is missing license and having proper licensing is also a JOSS requirement)

CNelias commented 2 years ago

Do you know why the auto-merge from MotifRecognition.jl failed? The license is there, and the error message is totally incomprehensible to me. I have added the missing license to SpectralEnvelope.jl.

bkamins commented 2 years ago

Do you know why the auto-merge from MotifRecognition.jl failed?

As you can see in https://github.com/JuliaRegistries/General/pull/45831#issuecomment-931248106 there are two problems:

CNelias commented 2 years ago

I have added the [compat] for DataStructures.jl.

I guess I will just let you know when the re-registration is complete then.

jbytecode commented 2 years ago

@johncwok are the package registration stuff okay?

if the compat issue is still current, you can have a look at the page https://pkgdocs.julialang.org/v1/compatibility/. In this page the sample in caret section

[compat]
PkgA = "^1.2.3" # [1.2.3, 2.0.0)
PkgB = "^1.2"   # [1.2.0, 2.0.0)
PkgC = "^1"     # [1.0.0, 2.0.0)
PkgD = "^0.2.3" # [0.2.3, 0.3.0)
PkgE = "^0.0.3" # [0.0.3, 0.0.4)
PkgF = "^0.0"   # [0.0.0, 0.1.0)
PkgG = "^0"     # [0.0.0, 1.0.0)

is very informative for your situation I think (if you don't want to specify a rigid version, you can also give a range - caret is a kind of range). On the other hand, Julia Registrator requires an incremental (by +1) version number and you cannot step forward at one time.

In compat, when we define

julia = "^1"

it is for 1.5.2, 1.5.3, 1.6, 1.6.1, 1.7 and so on so forth (<2).

you can also use the tilde specifier:

[compat]
PkgA = "~1.2.3" # [1.2.3, 1.3.0)
PkgB = "~1.2"   # [1.2.0, 1.3.0)
PkgC = "~1"     # [1.0.0, 2.0.0)
PkgD = "~0.2.3" # [0.2.3, 0.3.0)
PkgE = "~0.0.3" # [0.0.3, 0.0.4)
PkgF = "~0.0"   # [0.0.0, 0.1.0)
PkgG = "~0"     # [0.0.0, 1.0.0)
CNelias commented 2 years ago

Thanks! I actually already fixed the compat, but as bkamins stated earlier, this was not the only reason why the automatic registration was rejected by the bot. It is because my version numbering was not done according to the convention. Do you know how to trigger the registration bot again? I don't want to open another pull request for that and create extra work for the people dealing with the registrations by hand.

jbytecode commented 2 years ago

how about using

@ JuliaRegistrator register

in the registration issue as stated in https://github.com/JuliaRegistries/Registrator.jl or using the JuliaHub?

bkamins commented 2 years ago

Yes - re-triggering JuliaRegistrator should work.

CNelias commented 2 years ago

I get the following message from the registrator: Error while trying to register: "Tag with name v1.0.1 already exists and points to a different commit I made increments of the version when I fixed various bugs but did not register them, and now I can't auto merge because I cannot use an earlier version number that would meet the guidelines.

bkamins commented 2 years ago

Have you tried removing the tag? Also which exact repository you are referring to? (and could you please share the link to the commit at which you have triggered the registrator). Thank you!

CNelias commented 2 years ago

This is the link: https://github.com/johncwok/CategoricalTimeSeries.jl/commit/dbf270a8acfd43fa32cf531795d8358a0bc47cc4 I'll try to remove the tag (I am a bit unsure how) and see if it works.

bkamins commented 2 years ago

You have made 1.0.1 tag in your repository 6 days ago, while you are trying to register a package with version 1.0.1 on a commit made 5 hours ago.

@DilumAluthge - what would you recommend @johncwok to help him clean up the versions of his packages so that he can cleanly register them in an easiest way? Thank you!

DilumAluthge commented 2 years ago

Couple different options:

  1. Delete the tag.
  2. Register a new version. If AutoMerge complains that you have skipped a version, you can ask for a manual merge in the #pkg-registration channel on the Julia Slack.
CNelias commented 2 years ago

@bkamins The registering is done!

bkamins commented 2 years ago

I would recommend to add release notes for the latest releases (this is a minor thing).

Also I would recommend an improvement of the examples.

For instance I take an example:

using DelimitedFiles
using SerialDependence
using Plots
#reading 'pewee' time-series test folder.
series = readdlm("test\\pewee.txt",',')[1,:]
lags = collect(1:25)
v = cohen_coefficient(series, lags)
t, b = bootstrap_CI(series, cramer_coefficient, lags)
a = plot(lags, v, xlabel = "Lags", ylabel = "K", label = "Cramer's k")
plot!(a, lags, t, color = "red", label = "Limits of 95% CI"); plot!(a, lags, b, color = "red", label = "")

and have the following issues:

  1. I was not sure where the pewee.txt file is stored. You also use Windows path delimiter. I would recommend rather something like: joinpath(dirname(pathof(CategoricalTimeSeries)), "test", "pewee.txt")
  2. The SerialDependence package is not found as the user was not instructed earlier to install it.
  3. Then the function bootstrap_CI throws an error as "not found" (the reason is probably that SerialDependence.jl still does not have 0.1.1 release registered)

Could you please re-check that all examples work on fresh Julia installation after all your packages get registered? Thank you!

CNelias commented 2 years ago

I think most of the issues that you are describing are due to the fact that the merge with general registry has not happened yet. I had no Windows path delimiter anymore in the current repo. I implemented your suggestion nonetheless. Concerning bootstrap_CI, it is working without problems on my machine with the recently registered version. Do you know how often the general registry is updated?

I have one question regarding your second point (the user was not instructed earlier to install it): Do you mean that this should be stated in the documentation or is there a Julia friendly way to queue the user to do it when installing CategoricalTimeSeries?

jbytecode commented 2 years ago

@johncwok I think the examples should be minimal and should be complete enough to reveal up everything in a single shot so if it is possible please use hard-coded data instead external files in examples.

bkamins commented 2 years ago

external files in examples

These files are shipped along with the package installation, so I think it is OK to use them.

bkamins commented 2 years ago

Do you mean that this should be stated in the documentation

It should be stated in the documentation. If you do using SerialDependence you have to install SerialDependence.jl earlier. So either do not do using SerialDependence or instruct the user that it should be installed. I would prefer that using CategoricalTimeSeries should be enough and it should re-export all functionalities provided by dependencies, but it is of course up to you how you design it. The crucial thing is that it should be possible to just copy-paste the code from the documentation and it should work.

CNelias commented 2 years ago

Understood! I updated the doc to only include importations of CategoricalTimeSeries. I run all the examples from the doc via copy-paste, everything is fine. I think you can carry on with the review now.

PS: The doc is back online (adding a 'readthedoc.yml' was enough).

bkamins commented 2 years ago

you have dirname twice in joinpath(dirname(dirname(pathof(CategoricalTimeSeries))), "test", "pewee.txt")

bkamins commented 2 years ago

Your bounds for versions of Plots.jl in CategoricalTimeSeries.jl are incorrect. They disallow installing latest Plots.jl and also disallow installing latest SerialDepencence.jl version (which is required for things to work AFAICT).

Can you please share Project.toml and Manifest.toml that you are using under which all works, as AFAICT, it is impossible to install the version configuration of the packages that you say that you have installed.

CNelias commented 2 years ago

you have dirname twice in joinpath(dirname(dirname(pathof(CategoricalTimeSeries))), "test", "pewee.txt")

Yes, otherwise it links to the directory src, but I need the parent directory in order to access test.

CNelias commented 2 years ago

Your bounds for versions of Plots.jl in CategoricalTimeSeries.jl are incorrect. They disallow installing latest Plots.jl and also disallow installing latest SerialDepencence.jl version (which is required for things to work AFAICT).

Can you please share Project.toml and Manifest.toml that you are using under which all works, as AFAICT, it is impossible to install the version configuration of the packages that you say that you have installed.

I was using the old version since I thought that the new one hadn't been merged yet. I changed the Plots.jl compat entry, now the package can be installed from the url without error.

bkamins commented 2 years ago

Yes, otherwise it links to the directory src, but I need the parent directory in order to access test.

Right. I have forgotten to add you need to add ".." in joinpath. Alternatively what you propose is also OK.

bkamins commented 2 years ago

I changed the Plots.jl compat entry, now the package can be installed from the url without error.

You need to make a new release of CategoricalTimeSeries.jl for the changes to be visible to the users.

bkamins commented 2 years ago

The following definitions:

function init_values(x, y, algorithm = "IB")
    pxy = get_pxy(x, y)
    px = get_px(x, y)
    qt_x_init = get_qt_x_init(px, algorithm)
    qt_init = qt_x_init*px
    qy_t_init = pxy'*qt_x_init'.*(1 ./ qt_init)'
    return (qt_x_init, qt_init, qy_t_init)
end

function init_values(pxy, algorithm = "IB")
    px = get_px(pxy)
    qt_x_init = get_qt_x_init(px, algorithm)
    qt_init = qt_x_init*px
    qy_t_init = pxy'*qt_x_init'.*(1 ./ qt_init)'
    return (qt_x_init, qt_init, qy_t_init)
end

The second overwrites the first thus making problem with incremental compilation.

bkamins commented 2 years ago

The way you do include("Thresholds.jl") in https://github.com/johncwok/CategoricalTimeSeries.jl/blob/6aee66b2d9aeaee10ee7a0702307520c1e9f4903/src/MotifRecognition.jl#L4 and https://github.com/johncwok/CategoricalTimeSeries.jl/blob/82c951371011b4847bb461ee02ba227d95b6b399/src/CollisionMatrix.jl#L5 causes incremental precompilation problems. There are more issues like this.

jbytecode commented 2 years ago

@bkamins I suggest a full coverage of unit tests rather than testing only the main functionality for this situation.How about that?

CNelias commented 2 years ago

The following definitions:

function init_values(x, y, algorithm = "IB")
    pxy = get_pxy(x, y)
    px = get_px(x, y)
    qt_x_init = get_qt_x_init(px, algorithm)
    qt_init = qt_x_init*px
    qy_t_init = pxy'*qt_x_init'.*(1 ./ qt_init)'
    return (qt_x_init, qt_init, qy_t_init)
end

function init_values(pxy, algorithm = "IB")
    px = get_px(pxy)
    qt_x_init = get_qt_x_init(px, algorithm)
    qt_init = qt_x_init*px
    qy_t_init = pxy'*qt_x_init'.*(1 ./ qt_init)'
    return (qt_x_init, qt_init, qy_t_init)
end

The second overwrites the first thus making problem with incremental compilation.

How can I change that? I don't want to have two functions with different names when they are doing the same thing. Having the function call depend on the argument makes the code easier to understand (in my opinion).

bkamins commented 2 years ago

The easiest thing to do is to change algorithm into a keyword argument. Then all will be clean. Such a change is breaking so you would need to go for 0.2 release then.