EcoJulia / Microbiome.jl

For analysis of microbiome and microbial community data

Improve speed of `shannon` and `ginisimpson` #135

barucden closed this 2 years ago

barucden commented 2 years ago

Hi. I noticed that the definitions of `shannon` and `ginisimpson` perform some computations that could be avoided.

I am not a user of this package, so I am not sure how much this matters in practice. Still, here are some measurements:

julia> const x = rand(1000);

julia> @btime shannon(x)
  148.720 μs (2 allocations: 15.88 KiB)
6.730219402261953

julia> @btime new_shannon(x)
  11.158 μs (0 allocations: 0 bytes)
6.730219402261952

julia> @btime ginisimpson(x)
  4.278 μs (2 allocations: 15.88 KiB)
0.9986907680545117

julia> @btime new_ginisimpson(x)
  296.874 ns (0 allocations: 0 bytes)
0.9986907680545117

The speed difference should increase with the length of the input.

Be aware that the result of the new `shannon` is slightly off (it differs in the last digit above) due to floating-point rounding.
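
For readers wondering where such a saving could come from: the thread does not show the definitions of `new_shannon` and `new_ginisimpson`, so the following is only a sketch of one plausible route, not necessarily what this PR does. It assumes the original functions first build the normalized vector $p_i = x_i / S$, with $S = \sum_j x_j$, which would account for the allocations in the benchmarks. Both indices can be rewritten in terms of the raw values $x_i$, so the normalized vector never has to be materialized:

```latex
\begin{aligned}
H &= -\sum_i p_i \log p_i
   = -\sum_i \frac{x_i}{S}\,\bigl(\log x_i - \log S\bigr)
   = \log S - \frac{1}{S}\sum_i x_i \log x_i, \\
D &= 1 - \sum_i p_i^2
   = 1 - \frac{1}{S^2}\sum_i x_i^2 .
\end{aligned}
```

Terms with $x_i = 0$ contribute nothing under the usual convention $0 \log 0 = 0$. The rewritten forms are algebraically identical to the originals but accumulate rounding error in a different order, which would be consistent with a difference in the last printed digit.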

kescobo commented 2 years ago

:thinking: Ack! Sorry for this languishing. I could have sworn I replied to this and merged it already... :scream:

Anyway, looks great - these functions are rarely bottlenecks, but more speed with no cost is obviously great. Thanks for the contribution!