Nanostring-Biostats / InSituType

An R package for performing cell typing in SMI and other single cell data

key speed improvement: make the lldist() call faster #150

Closed: patrickjdanaher closed this issue 2 years ago

patrickjdanaher commented 2 years ago

The vast majority of computation time is spent calling lldist() as follows:

logliks <- apply(means, 2, function(x) {
  lldist(x = x, mat = counts, bg = bg, size = size)
})

This code is calculating the log-likelihood of every value in the counts matrix under each column (profile) in "means".
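
For readers who want to reproduce the shape of this computation, here is a minimal sketch. `lldist_sketch()` is a hypothetical stand-in, not the package's `lldist()`: it assumes a negative binomial model whose mean is a per-cell scaling of the profile plus a scalar background, with `counts` laid out as cells x genes. The simulated data and dimensions are likewise made up for illustration.

```r
# Hypothetical stand-in for lldist(); NOT the package's implementation.
# Assumption: negative binomial likelihood with mean = s * profile + bg.
lldist_sketch <- function(x, mat, bg = 0.01, size = 10) {
  # per-cell scaling factor that puts the profile on the scale of each cell
  s <- pmax(rowSums(mat) - bg * ncol(mat), 1e-6) / sum(x)
  # expected counts under this profile (cells x genes)
  yhat <- outer(s, x) + bg
  ll <- stats::dnbinom(mat, size = size, mu = yhat, log = TRUE)
  dim(ll) <- dim(mat)  # make the cells x genes shape explicit
  rowSums(ll)          # one log-likelihood per cell
}

# Simulated inputs (sizes are arbitrary)
set.seed(1)
n_cells <- 200; n_genes <- 50; n_types <- 4
means  <- matrix(rexp(n_genes * n_types), n_genes, n_types,
                 dimnames = list(NULL, paste0("type", seq_len(n_types))))
counts <- matrix(rpois(n_cells * n_genes, lambda = 2), n_cells, n_genes)

# Same calling pattern as in the issue: one call per profile (column of means)
logliks <- apply(means, 2, function(x) {
  lldist_sketch(x = x, mat = counts, bg = 0.01, size = 10)
})
dim(logliks)  # one row per cell, one column per profile
```

Under these assumptions each call builds a full cells x genes matrix of expected counts, so the cost grows with the number of profiles times the size of `counts`.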

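Questions:

  • Is there a way to benefit even more from matrix/vector operations?
  • Would this run faster if the "counts" matrix was transposed?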

dnadave commented 2 years ago

Is there a way to bring lldist into the apply function? This code makes two function calls, and arguments are passed by copy in R, so you are making two copies of your arguments. If lldist were defined inside your apply, you would get rid of one of those copies, which should speed things up some.

On Wed, May 25, 2022 at 11:40 AM Patrick Danaher wrote:

The vast majority of computation time is spent calling lldist() as follows:

logliks <- apply(means, 2, function(x) {
  lldist(x = x, mat = counts, bg = bg, size = size)
})

This code is calculating the log-likelihood of every value in the counts matrix under each column (profile) in "means".

Questions:

  • Is there a way to benefit even more from matrix/vector operations?
  • Would this run faster if the "counts" matrix was transposed?
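
On the first question, one possible direction is to compute once any term that does not depend on the profile. The sketch below is not the fix that was eventually merged. It assumes a negative binomial likelihood with mean `s * profile + bg` (an assumption about the model, with a hypothetical helper `expected()` and simulated data), and checks that hoisting the `lgamma` terms out of the loop gives the same result as calling `dnbinom()` per profile.

```r
# Assumed model for illustration only: negative binomial, mean = s * x + bg.
set.seed(1)
n_cells <- 100; n_genes <- 30; n_types <- 3
size <- 10; bg <- 0.01
means  <- matrix(rexp(n_genes * n_types), n_genes, n_types)
counts <- matrix(rpois(n_cells * n_genes, lambda = 2), n_cells, n_genes)

# Hypothetical helper: expected counts (cells x genes) under one profile x
expected <- function(x) {
  s <- pmax(rowSums(counts) - bg * n_genes, 1e-6) / sum(x)
  outer(s, x) + bg
}

# Baseline: a full dnbinom() evaluation for every profile
ll_baseline <- apply(means, 2, function(x) {
  ll <- stats::dnbinom(counts, size = size, mu = expected(x), log = TRUE)
  dim(ll) <- dim(counts)
  rowSums(ll)
})

# Variant: terms that do not involve the profile are computed once per cell.
# log NB(y; size, mu) = lgamma(y + size) - lgamma(size) - lgamma(y + 1)
#                       + size * log(size) + y * log(mu)
#                       - (y + size) * log(size + mu)
const <- rowSums(lgamma(counts + size) - lgamma(size) - lgamma(counts + 1)) +
  n_genes * size * log(size)
ll_hoisted <- apply(means, 2, function(x) {
  mu <- expected(x)
  const + rowSums(counts * log(mu) - (counts + size) * log(size + mu))
})

# The two results should agree up to floating-point tolerance
all.equal(ll_baseline, ll_hoisted)
```

Whether this is actually faster depends on the data size and on how `lldist()` is really implemented, so it would need to be benchmarked before drawing conclusions.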


patrickjdanaher commented 2 years ago

@davidpross took care of this. His fix is in the ADO main branch. Woo hoo!