lincc-frameworks / tape

[Deprecated] Package for working with LSST time series data
https://tape.readthedocs.io
MIT License

Reduced shared parquet file contention in testing #440

Closed wilsonbb closed 4 months ago

wilsonbb commented 4 months ago

Change Description

Since moving to dask-expr, we've seen increased test failures. In https://github.com/lincc-frameworks/tape/issues/434 there have been recent failures in test_batch_by_band, where a flux column is sometimes read in as NaNs during the from_parquet call used in that test. Interestingly, this only seems to occur when the tests are run as a suite, not when test_batch_by_band is run in isolation.

We can get the same effect as running in isolation by having the test build its ensemble from duplicate copies of the source and object files, so that it no longer shares parquet files with other tests. This appears to fix the test when run on GitHub Actions, but we should still investigate the underlying issue with from_parquet and then revert this change.
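As a rough illustration of the duplicate-file approach, here is a minimal sketch that copies shared data files into a test-private directory using only the standard library. The helper name `duplicate_data_files` and the file names are hypothetical and are not taken from the TAPE test suite; the copied paths would then be handed to whatever from_parquet call the test already makes.

```python
import shutil
import tempfile
from pathlib import Path


def duplicate_data_files(paths, dest_dir):
    """Copy each file in `paths` into `dest_dir` and return the new paths.

    Hypothetical helper: gives one test its own copies of data files so it
    does not read the same files as other tests in the suite.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for path in paths:
        src = Path(path)
        dst = dest_dir / src.name
        shutil.copy2(src, dst)  # copies file contents and metadata
        copies.append(dst)
    return copies


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "shared"
        shared.mkdir()
        # Placeholder bytes standing in for real parquet content.
        (shared / "source.parquet").write_bytes(b"source")
        (shared / "object.parquet").write_bytes(b"object")

        private = duplicate_data_files(
            [shared / "source.parquet", shared / "object.parquet"],
            Path(tmp) / "test_batch_by_band",
        )
        print([p.name for p in private])
```

In a pytest suite, the destination directory could be the built-in `tmp_path` fixture, which gives each test its own directory. This only sidesteps the shared-file reads; it does not explain why from_parquet returns NaNs in the first place.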

Code Quality

Bug Fix Checklist

github-actions[bot] commented 4 months ago

| Before [8f33ee8a] | After [1c6dc5a6] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 33.0±0.8ms | 32.4±0.1ms | 0.98 | benchmarks.time_prune_sync_workflow |
| 33.2±0.6ms | 32.0±0.7ms | 0.96 | benchmarks.time_batch |


codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 95.82%. Comparing base (40236a3) to head (5eb4149).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #440   +/-   ##
=======================================
  Coverage   95.82%   95.82%
=======================================
  Files          25       25
  Lines        1772     1772
=======================================
  Hits         1698     1698
  Misses         74       74
```
