lcmmichielsen / scHPL

MIT License

Error when learning celltype hierarchy #15

Closed Xicici-Yan closed 5 months ago

Xicici-Yan commented 5 months ago

Hi, I found it impossible to use scHPL to construct a new tree without batch information using the function 'scHPL.learn.learn_tree'. How can I learn the cell-type hierarchy of a reference dataset without batch info and then apply the model to another dataset? Thanks

nictru commented 5 months ago

Hey, I recently did a similar thing. As you correctly observed, in the paper, you have different datasets (which are treated as batches), each with cell-type annotations. The tree is then created based on the relationships between the annotations across batches.

If you only have a single dataset with a single annotation layer, the function will give you a flat structure. However, I had multiple annotation layers, so I picked a random layer for each cell and treated each layer as a dataset. This might not be how the package should be used, but it worked fine.
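A minimal sketch of the workaround described above, using only the standard library. The helper name `assign_pseudo_batches` and the layer names `coarse` and `fine` are made up for illustration. The assumption is that the resulting label and pseudo-batch lists would then be stored as the cell-type and batch columns that 'scHPL.learn.learn_tree' reads; that step is not shown here.

```python
import random

def assign_pseudo_batches(cells, layers, seed=0):
    """For each cell, pick one annotation layer at random.

    Returns two parallel lists: the label taken from the chosen layer,
    and the layer name, which plays the role of the 'batch' (dataset).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    labels, batches = [], []
    for cell in cells:
        layer = rng.choice(layers)
        labels.append(cell[layer])
        batches.append(layer)
    return labels, batches

# Hypothetical example: every cell carries a coarse and a fine annotation.
cells = [
    {"coarse": "T cell", "fine": "CD4 T cell"},
    {"coarse": "T cell", "fine": "CD8 T cell"},
    {"coarse": "B cell", "fine": "Naive B cell"},
    {"coarse": "B cell", "fine": "Memory B cell"},
]
labels, batches = assign_pseudo_batches(cells, ["coarse", "fine"])
```

Each layer needs enough cells to train on, so with very few cells per population a random split like this may leave some labels underrepresented.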

lcmmichielsen commented 5 months ago

What Nico suggests is indeed an option.

Another option would be to construct the tree yourself and only train it (instead of learning it). You can see an example in this notebook. The original authors of the Human Lung Cell Atlas already provided different annotation levels, so I used those to create a hierarchy and train it (Part 1 of the notebook). In Part 2, we update that trained tree with a healthy dataset.
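As a rough sketch of the "construct the tree yourself" route: if each cell has nested annotation levels, the hierarchy can be written out as a Newick-style string. The helper `hierarchy_to_newick` and the example labels are hypothetical. Whether this exact string format is what scHPL's tree-construction utility accepts should be checked against the package documentation and the notebook mentioned above.

```python
def hierarchy_to_newick(annotations, root="root"):
    """Build a Newick-style string from per-cell annotation paths.

    `annotations` is an iterable of tuples ordered from the coarsest
    to the finest level, e.g. ("T cell", "CD4 T cell").
    """
    children = {}  # parent label -> ordered list of child labels
    for path in annotations:
        parent = root
        for label in path:
            if label == parent:
                # Same label repeated at a finer level: no further split.
                continue
            kids = children.setdefault(parent, [])
            if label not in kids:
                kids.append(label)
            parent = label

    def render(node):
        kids = children.get(node, [])
        if not kids:
            return node  # leaf
        return "(" + ", ".join(render(k) for k in kids) + ")" + node

    return render(root)

# Hypothetical two-level annotation of three populations.
annotations = [
    ("T cell", "CD4 T cell"),
    ("T cell", "CD8 T cell"),
    ("B cell", "B cell"),
]
newick = hierarchy_to_newick(annotations)
print(newick)  # ((CD4 T cell, CD8 T cell)T cell, B cell)root
```

The finest-level labels are the ones that would be used for training, since the coarser nodes are implied by the tree.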

Xicici-Yan commented 4 months ago

Thanks, I constructed the tree myself in a similar way. It does help!