canimus / cuallee

Possibly the fastest DataFrame-agnostic quality check library in town.
https://canimus.github.io/cuallee/
Apache License 2.0

[JOSS REVIEW] Run performance benchmark on CI pipeline #251

Closed by devarops 5 months ago

devarops commented 5 months ago

We were able to reproduce the performance benchmark manually. 🎉 However, seeing these results in the CI pipeline would be ideal. 🤖

Please add a job to the CI workflow on GitHub Actions to reproduce the performance benchmark.


This is the last issue of the review:

canimus commented 5 months ago

Hi @devarops, thanks for taking the time to evaluate the performance benchmark for cuallee. I would like to contest this request, in the context of our JOSS submission, for three main reasons:

  1. After reviewing 1000 repositories submitted to JOSS, I don't see this as a requirement, nor do I see benchmarks reported as part of their GitHub Actions workflows; at most, they present test results
  2. The pipeline as it stands already consumes 10 minutes of GitHub runner time, which makes every test release a time-consuming procedure
  3. To evaluate the distinct frameworks, we would need to issue tokens for the Soda and GE providers. That is already one of the main drawbacks of testing in BigQuery today, along with renewing the freemium Snowflake account.

I will be happy to coordinate a session in which we can reproduce it once again, and I commit to adding more reproducible steps to the test performance folder, but I don't think this should block the review from proceeding toward publication.

Can we therefore close this issue and work on any other outstanding review items you consider relevant? Thanks in advance; I truly appreciate your time. Best regards, Herminio

devarops commented 5 months ago

Hi @canimus,

Thank you for your detailed response. I appreciate the clarification regarding the performance evaluation.

The performance evaluation you provided is indeed manually reproducible with the Dockerfile and shell script included in the repository. As such, it is not a blocking issue. I offered my suggestion only for consideration, as a way to enhance completeness.

Given your explanation, I am happy to recommend the publication of the paper in its current state.

Thank you for your efforts and dedication.

Cheers, Evaristo