chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License
37 stars, 23 forks

Greatly relax dependencies for CLI #677

Closed: Zethson closed this issue 11 months ago

Zethson commented 12 months ago

Motivation

The package is currently very hard to install in environments that contain other packages. The pins are too strict and reference old versions that are no longer supported.

See: https://github.com/chanzuckerberg/single-cell-curation/blob/b680cc775c5bae53c73a1a64ee21a659dea102ec/cellxgene_schema_cli/requirements.txt

Definition of Done

Please unpin as many packages as possible and reasonable. Please also ensure that the package still works with the newer versions. There may be issues with Pandas.
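
As a rough illustration of what "unpinning" could involve, the sketch below lists exact `==` pins in a requirements file and rewrites each as a lower bound. The helper names (`find_exact_pins`, `relax_pin`) and the sample package names are hypothetical and are not taken from the repository; whether a lower bound is actually safe for a given package would still need to be tested.

```python
import re

# Matches an exact pin such as "name==1.2.3" at the start of a line.
# Extras, environment markers, and "===" pins are deliberately not handled.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#=][^\s;#]*)")


def find_exact_pins(requirements_text):
    """Return (package, version) pairs for every '==' pin in the text."""
    pins = []
    for line in requirements_text.splitlines():
        match = PIN_RE.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


def relax_pin(package, version):
    """Turn an exact pin into a lower bound, e.g. 'pkg==1.2' -> 'pkg>=1.2'."""
    return f"{package}>={version}"


# Hypothetical sample text, NOT the real contents of requirements.txt.
SAMPLE = """\
examplepkg==1.0.0
otherpkg>=2.0
# a comment line
thirdpkg==0.3.1  # trailing comment
"""

relaxed = [relax_pin(name, version) for name, version in find_exact_pins(SAMPLE)]
```

Here only the two exactly pinned lines are reported; the already relaxed `otherpkg>=2.0` line and the comment are left alone.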

Tasks

Detail the specific tasks that can be used to accomplish the desired changes. If detailed steps cannot be provided at this time, please file a Tech Proposal instead.

brianraymor commented 11 months ago

We generally recommend the use of venv or pipx. The CLI is primarily for use by our curators and our ingestion pipeline, which accounts for the strictness. For example, we currently only accept anndata 0.8.0 datasets per the schema.

Can you tell me more about your use of the CLI?

Zethson commented 11 months ago

We subclassed your Validator and implemented some of our own checks.

brianraymor commented 11 months ago

> implemented some of our own checks.

Can you share more details?

Zethson commented 11 months ago

It's a private repository. Nevertheless, we just wanted a quick-and-dirty Validator that checks whether specific columns are present and, for those columns, allows only specific terms.

It's nothing complex.
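
A check of the kind described here might look roughly like the sketch below. It is not the code from the private repository and does not use the `cellxgene_schema` Validator API; the function name `validate_columns`, the column names, and the allowed terms are all assumptions made for illustration. Columns are modelled as a plain dict of lists; in practice they would presumably come from the dataset's `obs` table.

```python
# Hypothetical rules: required column name -> set of allowed terms.
ALLOWED_TERMS = {
    "tissue": {"lung", "liver"},
    "assay": {"10x 3' v3"},
}


def validate_columns(columns, allowed_terms=ALLOWED_TERMS):
    """Check that required columns exist and contain only allowed terms.

    `columns` maps a column name to a list of its values.
    Returns a list of human-readable error strings (empty if valid).
    """
    errors = []
    for name, allowed in allowed_terms.items():
        if name not in columns:
            errors.append(f"missing required column '{name}'")
            continue
        # Collect every distinct value that is not in the allowed set.
        unexpected = sorted(set(columns[name]) - allowed)
        if unexpected:
            errors.append(f"column '{name}' has disallowed terms: {unexpected}")
    return errors


good = {"tissue": ["lung", "liver", "lung"], "assay": ["10x 3' v3"]}
bad = {"tissue": ["lung", "brain"]}

good_errors = validate_columns(good)
bad_errors = validate_columns(bad)
```

Returning a list of errors rather than raising on the first problem lets a caller report every missing column and disallowed term in one pass.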

brianraymor commented 11 months ago

I'm glad that you were able to reuse the code.

Zethson commented 11 months ago

So you're not planning to relax the dependencies?

brianraymor commented 11 months ago

That is correct per my comments above.