Pick the column in var DF to mask by in tools.find_named_genes

ukleiner commented 1 year ago

Allow finding vars (genes) by columns and not only by var.index (var_names)

orenbenkiki commented 1 year ago

Nice. Some points though:

MC2 naming conventions is to use descriptive names for parameters. col isn't. It may be "color", it may be "column", it may be whatever. It should be named "name_property" so there's no ambiguity on what it means.
In general we do not talk about "columns". We talk about "per-variable (gene) annotations". These are only "columns" in some places in some of the time, but they are always per-variable annotations.
Likewise var_names_col most definitely ism't a "column". It is just an array. Calling it just var_names is clearer and more correct.
In the package code we set get_v_numpy(adata, "whatever") instead of adata.var["whatever"]. This may seem like a distinction without a difference - it isn't (in some cases) because the returned data type is different (simple numpy.ndarray as opposed to a pandas.Series). These two types are only "mostly compatible" but they are subtly (and confusinnly) different.
Finally, you need to make this conflict-free with the head revision.

These are minor points though - fix them and I'll merge this to be released in the coming-really-soon-I-promise v0.9.

Thanks!

orenbenkiki commented 1 year ago

Github still complains about a conflict - specifically in named.py - it won't let me resolve the conflicts, it seems it insists the PR author will do so.

ukleiner commented 1 year ago

I'm working on resolving the issue. I'll post once I think I'm done

ukleiner commented 1 year ago

Looks like HEAD is missing pipeline/clean.py I had to add it for the analyze_clean_genes function, I couldn't find it in any other file in HEAD. Please let me know if everything is fine. Thank you for the very clear guidelines in this pull request

orenbenkiki commented 1 year ago

Hmmm - I think I renamed this file a while back when as part of the "grand rename". There were several structural changes at the time. Since this channge is small, perhaps your best bet is to delete this PR, clone the repository as it is now, and apply the changes to it (creating a new PR). This would probably be less of a hassle than doing the merge here (git, what can we do... if the world was using Darcs or something like it, it wouldn't have been a problem...). Sorry about the hassle.

ukleiner commented 1 year ago

Done. I think there are no conflicts now. https://github.com/tanaylab/metacells/pull/48

tanaylab / metacells

Pick the column in var DF to mask by in tools.find_named_genes #47