single-cell-data / TileDB-SOMA

Python and R SOMA APIs using TileDB’s cloud-native format. Ideal for single-cell data at any scale.
https://tiledbsoma.readthedocs.io
MIT License

[python] Add `VFS` binding in `pybind11` #2882

Closed nguyenv closed 4 days ago

nguyenv commented 2 months ago

Issue and/or context:

This PR is separated out from https://github.com/single-cell-data/TileDB-SOMA/pull/2752

Changes:

Note for Reviewers https://github.com/single-cell-data/TileDB-SOMA/pull/2882#issuecomment-2286697745

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 83.36%. Comparing base (7381508) to head (15a8854). Report is 4 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #2882      +/-   ##
==========================================
+ Coverage   83.22%   83.36%   +0.13%
==========================================
  Files          51       51
  Lines        5462     5463       +1
==========================================
+ Hits         4546     4554       +8
+ Misses        916      909       -7
```

| [Flag](https://app.codecov.io/gh/single-cell-data/TileDB-SOMA/pull/2882/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data) | Coverage Δ | |
|---|---|---|
| [python](https://app.codecov.io/gh/single-cell-data/TileDB-SOMA/pull/2882/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data) | `83.36% <100.00%> (+0.13%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Components](https://app.codecov.io/gh/single-cell-data/TileDB-SOMA/pull/2882/components?src=pr&el=components&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data) | Coverage Δ | |
|---|---|---|
| [python_api](https://app.codecov.io/gh/single-cell-data/TileDB-SOMA/pull/2882/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data) | `83.36% <100.00%> (+0.13%)` | :arrow_up: |
| [libtiledbsoma](https://app.codecov.io/gh/single-cell-data/TileDB-SOMA/pull/2882/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=single-cell-data) | `∅ <ø> (∅)` | |

nguyenv commented 2 months ago

We cannot merge this PR into main on its own; it needs to wait for https://github.com/single-cell-data/TileDB-SOMA/pull/2752. clib.VFS binds tiledb::VFS. Importing tiledb-py brings in tiledb.VFS, which also binds tiledb::VFS. Because both bindings register the same C++ type in the same Python process, they can clobber each other, resulting in this error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledbsoma/__init__.py", line 146, in <module>
    from ._collection import Collection
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledbsoma/_collection.py", line 34, in <module>
    from . import _funcs, _tdb_handles
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledbsoma/_tdb_handles.py", line 39, in <module>
    from .options._soma_tiledb_context import SOMATileDBContext
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledbsoma/options/__init__.py", line 1, in <module>
    from ._soma_tiledb_context import SOMATileDBContext
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledbsoma/options/_soma_tiledb_context.py", line 19, in <module>
    import tiledb
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledb/__init__.py", line 22, in <module>
    from .array_schema import ArraySchema
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tiledb/array_schema.py", line 8, in <module>
    import tiledb.cc as lt
ImportError: generic_type: type "VFS" is already registered!

PR 2752 completely removes import tiledb, so this will no longer be a problem.
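
For readers unfamiliar with this failure mode, here is a minimal sketch of why the second registration is rejected. It assumes, as the error message suggests, that bindings are tracked in a process-wide registry keyed by C++ type. ToyTypeRegistry, FakeVFS, and second_registration_throws are hypothetical names invented for illustration; this is a toy model, not pybind11's actual implementation, and it does not depend on TileDB.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Toy stand-in for a process-wide binding registry keyed by C++ type.
// Assumption: this only models the failure mode, not pybind11 internals.
class ToyTypeRegistry {
   public:
    template <typename T>
    void register_type(const std::string& python_name) {
        const std::type_index key(typeid(T));
        if (types_.count(key) != 0) {
            // Mirrors the wording of the error in the traceback above.
            throw std::runtime_error(
                "generic_type: type \"" + python_name +
                "\" is already registered!");
        }
        types_.emplace(key, python_name);
    }

    std::size_t size() const { return types_.size(); }

   private:
    std::unordered_map<std::type_index, std::string> types_;
};

// Hypothetical stand-in for tiledb::VFS so the sketch has no dependencies.
struct FakeVFS {};

// Returns true when a second registration of the same C++ type is rejected.
inline bool second_registration_throws() {
    ToyTypeRegistry registry;
    registry.register_type<FakeVFS>("VFS");  // first extension module
    assert(registry.size() == 1);
    try {
        registry.register_type<FakeVFS>("VFS");  // second extension module
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In this model the Python-side name is irrelevant: the key is the C++ type, so two modules that both bind the same type collide no matter which is imported first.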

nguyenv commented 5 days ago

@johnkerl As mentioned above, I was running into a namespace clash error because tiledb::VFS is bound by both the tiledb-py and tiledbsoma-py libraries. Since some tiledb-py usage will remain in the code until everything is deprecated in the next major release, I came up with a workaround that gets tiledbsoma-py working: we bind against SOMAVFS, a thin wrapper around tiledb::VFS, rather than against tiledb::VFS directly. Let me know if this works as a temporary solution so that this PR can be merged.

// TODO This temporary workaround prevents namespace clash with tiledb-py.
// Bind tiledb::VFS directly once tiledb-py dependency is removed
class SOMAVFS : public tiledb::VFS {
   public:
    using tiledb::VFS::VFS;
};
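
A minimal sketch of why a thin subclass like this can sidestep the clash, assuming bindings are keyed by C++ type: the derived class is a distinct type with its own type identity, while the using-declaration inherits the base class constructors so it can be constructed the same way. BaseVFS, WrappedVFS, and wrapper_has_distinct_type_key are hypothetical stand-ins, not TileDB or tiledbsoma names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical stand-in for tiledb::VFS (assumption: any class with
// ordinary constructors behaves the same way for this purpose).
class BaseVFS {
   public:
    explicit BaseVFS(std::string config) : config_(std::move(config)) {}
    const std::string& config() const { return config_; }

   private:
    std::string config_;
};

// Same shape as the SOMAVFS workaround: derive publicly and inherit
// the base class constructors without adding any members.
class WrappedVFS : public BaseVFS {
   public:
    using BaseVFS::BaseVFS;
};

// The wrapper is still usable wherever the base class is expected...
static_assert(std::is_base_of<BaseVFS, WrappedVFS>::value,
              "WrappedVFS must derive from BaseVFS");
// ...and it can be constructed with the base class constructor arguments.
static_assert(std::is_constructible<WrappedVFS, std::string>::value,
              "constructors must be inherited");

// Returns true when the wrapper and the base class have different type keys,
// which is what lets a type-keyed registry treat them as separate entries.
inline bool wrapper_has_distinct_type_key() {
    const bool distinct = std::type_index(typeid(WrappedVFS)) !=
                          std::type_index(typeid(BaseVFS));
    assert(distinct);
    return distinct;
}
```

The trade-off is that the wrapper is a different type from the one tiledb-py binds, so objects are not interchangeable between the two Python packages, which seems acceptable for a temporary workaround.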