single-cell-data / TileDB-SOMA

Python and R SOMA APIs using TileDB’s cloud-native format. Ideal for single-cell data at any scale.
https://tiledbsoma.readthedocs.io
MIT License
84 stars 25 forks source link

[r] Fix dependency of a test on fixed buffer sizes #2685

Closed johnkerl closed 3 months ago

johnkerl commented 3 months ago

Addresses #2684 .

johnkerl commented 3 months ago

Temporarily I added commit #2 https://github.com/single-cell-data/TileDB-SOMA/pull/2685/commits/432db0100fc9b59669dad22b4f86a9103fb4456a which suppresses the env-var approach to buffer sizes, forcing us to deal with them correctly in the particular apis/r/tests/testthat/test-SOMAArrayReader-Iterated.R which (currently) depends on a particular buffer size.

This allows me to imitate the nightly-build failure seen by @jdblischak at https://github.com/single-cell-data/TileDB-SOMA/blob/1.11.4/apis/r/vignettes/soma-experiment-queries.Rmd#L84.

This is temporary -- once soma.init_buffer_bytes is wired up correctly, for apis/r/tests/testthat/test-SOMAArrayReader-Iterated.R, then we can restory the CI YAML changes.

jdblischak commented 3 months ago

I kicked off a nightly run that uses kerl/flex-buffer-sizes for TileDB-SOMA

https://github.com/jdblischak/centralized-tiledb-nightlies/actions/runs/9390901834

johnkerl commented 3 months ago

@eddelbuettel as discussed in https://github.com/single-cell-data/TileDB-SOMA/pull/2681 I had on #531 attempted this before and ran into issues with various-sized runners.

The TILEDB_SOMA_INIT_BUFFER_BYTES lines on the current CI YAMLs are for robustness with respect to various GHA runners.

They are not related to the bugfix done on this PR.

This PR fixed a bug wherein two cases would pass only with fixed buffer sizes. That is fixed.

What remains, and what needs to remain, is "low ceiling" for all tests, all languages (Python and R), Linux, MacOS, and Windows.

eddelbuettel commented 3 months ago

I believe we are talking about two different issues but I have to attend other matters rather than debate this here.

eddelbuettel commented 3 months ago

The TILEDB_SOMA_INIT_BUFFER_BYTES lines on the current CI YAMLs are for robustness with respect to various GHA runners.

They are not related to the bugfix done on this PR.

By keeping them as env vars you ensure the values are picked up. That neuters any and all CI on this very PR. As the values are already picked 'for certain' from the TileDB Core API under this, we cannot actually ascertain that this PR does as it claims.

johnkerl commented 3 months ago

@eddelbuettel thank you for your feedback.

As noted I don't want to break future CI including Windows as linked on #531.

And we do have the nightlies as on https://github.com/jdblischak/centralized-tiledb-nightlies/issues/11#issuecomment-2150323719 for extra fallback.

The env vars here must be retained with some small values.

But I could bump them from say 16MB in the env vars to 32MB, to ensure bug-fix regression checking on apis/r/tests/testthat/test-SOMAArrayReader-Iterated.R and maintaining the "low ceiling" needed for Windows CI.

eddelbuettel commented 3 months ago

Conceptually speaking, now that have a handle on setting size as needed, should we not run with the defaults at CI (because hey, CI is there to check our defaults) and the override in test files as needed as we did here?

(I am aware that we are smudging the lines in other places by suppressing some CI tests on some platforms, or under valgrind, or skip some, ... )

johnkerl commented 3 months ago

Thank you again for your feedback and dialogue.

johnkerl commented 3 months ago

This is a good idea to think about.

For Windows CI as noted above: the ceiling was so low I didn't want to futz every single test -- I felt the CI env var approach was far simpler.

If it's one OOM test -- fine, soma.init_buffer_bytes. If it's many -- env var is better.

I don't have a record of how many Windows tests were failing.

I'll flag this for later follow-up.

Right now we've unbroken the nightlies which is most pressing. Windows-CI issues can wait for follow-up.

eddelbuettel commented 3 months ago

That's basically what I wanted to get at with 'remove the CI value' because it will now "blind us" from possible errors outside of CI (i.e. real users and clients...) as we are forcing the old small size back in so we likely will experience fewer failtures. Not great so maybe worth a try in a branch.

The main thing is that the regression is fixed, and Paul and I had some good chats about configs. All that may need a revisit too.

johnkerl commented 3 months ago

I will file a follow-up task.

There is other more pressing tech debt than this particular task.

Thank you for your dialogue.

eddelbuettel commented 3 months ago

I maintained for maybe one and a half years that the layout, build system and CI are all tech debt and got some verbal promises that 'after the release' (i.e 1.5.0) this would get addressed. It's now nearly a year later and nothing changed. But that's the way it goes with promises: some remain unfulfilled.

johnkerl commented 3 months ago

Getting rid of TileDB-Py and TileDB-R remains tech debt number one. Glad we're getting close on those.

eddelbuettel commented 3 months ago

Well we will see about that. All the config calls we used today make use of tiledb-r config and ctx objects.