Understanding `neighbors_within_batch` parameter?

Teichlab / bbknn

Batch balanced KNN

MIT License


Thanks for the kind words, and sorry for the slightly delayed reply. I need to start regularly checking the email tied to my GitHub again.

BBKNN performs a separate KNN search within each batch and then merges the resulting neighbour lists. `neighbors_within_batch` is the k used for each of those per-batch searches. The value of 3 stems from the fact that, when the search runs on the batch a particular cell comes from, the cell itself is always returned as one of its own nearest neighbours, regardless of the neighbour identification algorithm. That leaves two genuine neighbours within the cell's own batch, and going any lower than that feels excessive. The value can be adjusted if desired, but it is kept low because that tends to give better correction (as you noticed) while also improving run time.
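To make the per-batch search concrete, here is a minimal pure-Python sketch of the idea described above. It is an illustration under simplifying assumptions (brute-force Euclidean distances, toy 2D data), not BBKNN's actual implementation, and the function name `per_batch_knn` is hypothetical.

```python
import math

def per_batch_knn(points, batches, k=3):
    """For every cell, find its k nearest cells within each batch separately,
    then concatenate the per-batch neighbour lists (brute force, Euclidean).
    Illustrative only; this is not how BBKNN is implemented internally."""
    batch_ids = sorted(set(batches))
    merged = []
    for p in points:
        neighbours = []
        for b in batch_ids:
            # Candidates are restricted to batch b; for the cell's own batch
            # this includes the cell itself, at distance 0.
            members = [j for j, bj in enumerate(batches) if bj == b]
            members.sort(key=lambda j: (math.dist(p, points[j]), j))
            neighbours.extend(members[:k])
        merged.append(neighbours)
    return merged

# Toy data (made up for illustration): six cells, two batches of three.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0),
          (0.5, 0.0), (2.0, 0.0), (6.0, 0.0)]
batches = ["a", "a", "a", "b", "b", "b"]

graph = per_batch_knn(points, batches, k=2)
print(graph[0])  # neighbours of cell 0: k from batch "a", then k from batch "b"
```

Each cell ends up with k neighbours per batch, and one of the k from its own batch is always the cell itself, which is why a very small k leaves few genuine within-batch neighbours.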

Understanding `neighbors_within_batch` parameter? #19