gbif / metabarcoding-data-toolkit-ui

Frontend for the eDNA tool

Use relative abundance to calculate Bray-Curtis index #104

Closed tobiasgf closed 6 months ago

thomasstjerne commented 6 months ago

Isn't this sufficient?

[Screenshot at May 13 14-17-27]
tobiasgf commented 6 months ago

That is probably fine. The question I got was actually about whether the BC dissimilarities were calculated on relative abundances. I believe we are calculating the BC based on the raw read counts, right? It may be best practice to convert to relative abundances (within each sample) first. Would that be feasible?

thomasstjerne commented 6 months ago

It looks like we do this: https://github.com/gbif/edna-tool-ui/issues/2#issuecomment-1717637957 i.e.

taking the fourth root of each value (x^0.25) is a quick and acceptable solution.
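
For illustration, here is a minimal Python sketch of what a fourth-root transform does to read counts. The function name `fourth_root` and the example counts are made up for this sketch and are not taken from the tool's code.

```python
def fourth_root(counts):
    """Return each read count raised to the power 0.25 (hypothetical helper)."""
    return [x ** 0.25 for x in counts]


# Invented read counts for one sample, spanning several orders of magnitude.
counts = [0, 1, 16, 10000]
transformed = fourth_root(counts)

# The raw counts span a 10000:1 range between the largest and smallest
# non-zero values; after the transform that range shrinks to roughly 10:1.
print(transformed)
```

The transform compresses large values but leaves differences in total read count between samples in place.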

tobiasgf commented 6 months ago

That is to downweight the influence of high numbers. But to make sample-to-sample comparisons fair, the sampling effort (read counts per sample) needs to be similar, which can be achieved by resampling (to even depth) or by scaling. I suggest the latter: dividing each read count by the sample's total read count (= relative abundances).
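
A minimal Python sketch of the scaling suggested above, assuming each sample is a plain list of read counts in the same taxon order. The helper names `relative_abundance` and `bray_curtis` and the example data are hypothetical and not taken from the tool's code.

```python
def relative_abundance(counts):
    """Divide each read count by the sample's total read count."""
    total = sum(counts)
    if total == 0:
        # Assumption: an empty sample maps to all zeros rather than raising.
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


# Two invented samples with identical composition but a tenfold
# difference in sequencing depth.
sample_a = [10, 20, 70]
sample_b = [100, 200, 700]

raw_bc = bray_curtis(sample_a, sample_b)
rel_bc = bray_curtis(relative_abundance(sample_a), relative_abundance(sample_b))

print(raw_bc)  # large, driven only by the difference in depth
print(rel_bc)  # essentially zero, since the compositions match
```

On raw counts the two samples look very dissimilar even though their composition is the same; after scaling to relative abundances the dissimilarity reflects composition only.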