laminlabs / lamindb

A data framework for biology.
https://docs.lamin.ai
Apache License 2.0

Search is much better on the UI than in the open-source package #1708

Open falexwolf opened 4 months ago

falexwolf commented 4 months ago
[image]

falexwolf commented 2 months ago

Search on the UI is good: https://lamin.ai/laminlabs/cellxgene/bionty/celltype

The question is whether we want to somehow wire the "good search endpoint" into lamindb in case somebody runs on a deployed instance. 🤔

Koncopd commented 2 months ago

And why is it so?

Zethson commented 2 months ago

Also see https://github.com/laminlabs/bionty/issues/98#issuecomment-2317618942

Koncopd commented 1 month ago

When I created a collection with name="check mapped", I got these results: [image]. I assume they come from search. The names are:

'Massively multiplex chemical transcriptomics at single-cell resolution'
'Mapping single-cell transcriptomes in the intra-tumoral and associated territories of kidney cancer'
'A spatially resolved atlas of the human lung characterizes a gland-associated immune niche'

And these look completely irrelevant to me.

falexwolf commented 2 weeks ago

The RapidFuzz search on the public ontology seems reasonable here, except for the last two automated labels:

{
      'Dendritic cells': 'dendritic cell',
      'CD19+ B': 'B cell, CD19-positive',
      'CD4+/CD45RO+ Memory': 'effector memory CD4-positive, alpha-beta T cell, terminally differentiated',
      'CD8+ Cytotoxic T': 'cytotoxic T cell',
      'CD4+/CD25 T Reg': 'CD8-positive, CD25-positive, alpha-beta regulatory T cell',
      'CD14+ Monocytes': 'CD14-positive, CD16-negative classical monocyte',
      'CD56+ NK': 'CD16-positive, CD56-dim natural killer cell, human',
      'CD8+/CD45RA+ Naive Cytotoxic': 'CD38-positive naive B cell',
      'CD34+': 'CD38-high pre-BCR positive cell'
}

Source: https://docs.lamin.ai/scrna2

It's also a good test case.
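
One possible way to turn the mapping above into a regression test is sketched below. It is only a rough heuristic and not part of lamindb or bionty: a hit is flagged when it names CD markers but shares none of them with the query. The names `results`, `markers`, and `is_suspicious` are hypothetical helpers introduced for illustration.

```python
import re

# Query label -> top search hit, copied from the mapping above.
results = {
    "Dendritic cells": "dendritic cell",
    "CD19+ B": "B cell, CD19-positive",
    "CD4+/CD45RO+ Memory": "effector memory CD4-positive, alpha-beta T cell, terminally differentiated",
    "CD8+ Cytotoxic T": "cytotoxic T cell",
    "CD4+/CD25 T Reg": "CD8-positive, CD25-positive, alpha-beta regulatory T cell",
    "CD14+ Monocytes": "CD14-positive, CD16-negative classical monocyte",
    "CD56+ NK": "CD16-positive, CD56-dim natural killer cell, human",
    "CD8+/CD45RA+ Naive Cytotoxic": "CD38-positive naive B cell",
    "CD34+": "CD38-high pre-BCR positive cell",
}

# Matches marker tokens such as CD8, CD45RA, or CD38 (assumed naming pattern).
CD_MARKER = re.compile(r"CD\d+[A-Z]*")


def markers(text):
    """Return the set of CD marker tokens mentioned in the text."""
    return set(CD_MARKER.findall(text))


def is_suspicious(query, match):
    """Flag a hit that names CD markers but shares none with the query."""
    match_markers = markers(match)
    return bool(match_markers) and not (match_markers & markers(query))


flagged = [query for query, match in results.items() if is_suspicious(query, match)]
print(flagged)
```

On this mapping the heuristic flags exactly the last two labels. It would miss mismatches that share at least one marker or involve no markers at all, so it is a coarse check, not a measure of search quality.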