Closed jkeuskamp closed 1 year ago
Have you got a reproducible example? Neither Jari or I typically work with the sorts of data that would involve phyloseq.
The only change of note is this addition as that's the only thing that changed between 2.6-2 and 2.6-4 from what I can tell, but that looks fine, and the error you quote seems to implicate phyloseq not vegan as the error relates to S4 methods, which we don't use and clearly phyloseq does. Perhaps try without directly accessing the slot in the phyloseq object and instead use an extractor?
when importing the attached RDS as x:
vegan::rarefy(x,2) did work in v2.6-2 but doesn't in v2.6-4.
However, following your suggestion and using
vegan::rarefy(as.data.frame(x),2)
does also work in v2.6-4, which solves the problem for me. mydata.RDS.zip
This looks as the same as problem as reported in stackoverflow for rarecurve. This was due to changes in phyloseq which uses non-standard structures that cannot be handled with as.matrix
. I see this as a problem in phyloseq, and I wish they fix their internal structures so that they can be handled as matrices.
While phyloseq using non-standard structures may be the underlying cause, I have confirmed that it did work in 2.6-2, and stopped working after only updating vegan to updating to 2.6-4.
From: Jari Oksanen @.> Date: Wednesday, 21 June 2023 at 18:36 To: vegandevs/vegan @.> Cc: Keuskamp, J.A. (Joost) @.>, Author @.> Subject: Re: [vegandevs/vegan] rarify breaks between version 2.6-2 and 2.6-4 (Issue #579)
This looks as the same as problem as reported in stackoverflow for rarecurvehttps://stackoverflow.com/questions/74530039/error-vegan-rarecurve-error-in-asx-matrixi-j-drop-false-subscript. This was due to changes in phyloseq which uses non-standard structures that cannot be handled with as.matrix.
— Reply to this email directly, view it on GitHubhttps://github.com/vegandevs/vegan/issues/579#issuecomment-1601297874, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAHKBHDOGGEEEP7SP2CEP7DXMMWILANCNFSM6AAAAAAZOPH3KY. You are receiving this because you authored the thread.Message ID: @.***>
I guess it stopped working after parliamentary elections in Finland, but that is also unrelated to vegan. The only change in the function rarefy
is this:
diff --git a/R/rarefy.R b/R/rarefy.R
index 3c5fdf5..56fcb44 100644
--- a/R/rarefy.R
+++ b/R/rarefy.R
@@ -7,6 +7,9 @@
x <- t(x)
if (!identical(all.equal(x, round(x)), TRUE))
stop("function accepts only integers (counts)")
+ minobs <- min(x[x > 0])
+ if (minobs > 1)
+ warning(gettextf("most observed count data have counts 1, but smallest count is %d", minobs))
minsample <- min(apply(x, MARGIN, sum))
if (missing(sample)) {
stop(
and this is completely unrelated to your case. Something else has changed, and that something else no longer works with vegan.
The error certainly happens with the line
minobs <- min(x[x > 0])
which is why we see the difference between the versions. But there is nothing wrong with this if x
is a matrix and something about a phyloseq object or it's methods for coercing said class to matrix or subsetting it (via [
methods) means your object isn't working correctly. But that is a problem with phyloseq, not vegan. My guess right now is that the phyloseq developers defined methods for [
for their "otu_table"
class and it doesn't understand the vector subsetting variant of [
for matrices.
If the issue is that you just accessed the slot containing the data, fix your code to not do that; you're supposed to use sample_data(foo)
to extract the data from the phyloseq object IIRC.
Either way this isn't a problem in vegan, even if the change we made exposed a previous issue with your/their code.
Thanks for letting us know about this issue @jkeuskamp but I suggest you follow up with the phyloseq devs if the problem persists when you use their approved way of accessing the data from their object.
There seem be two ways of circumventing the new phyloseq feature:
otu_table
to be a matrix. Here base::as.matrix
does not work, but you must forcibly change the class to matrix like suggested in the Stackoverflow discussion linked to above.otu_table
to a standard data frame with base::as.data.frame
like you suggested in your message. After this base::as.matrix
works and then the indexing works in the standard way.Naturally we could add any of these tricks to vegan, but I'm reluctant to do that just to circumvent design deficiencies in non-CRAN packages. For most users x <- as.matrix(as.data.frame(x))
would look strange – in particular when x
was a matrix originally.
@jarioksa I agree that we shouldn't handle this problem for vegan — regardless whether it is a CRAN package or in another package repository. Apart from the extra developer effort required and ongoing maintenance if phyloseq changes again, there could be plenty of other places in vegan where such things fail, and if they fail in vegan, which uses standard R code, then there are surely problems with other R packages.
This should be fixed by the phyloseq developers so that the [
method understands vector indexing.
@gavinsimpson Actually it is not vector indexing: x > 0
for 2-dim matrix produces a logical matrix of same dimensions as x
. It is only dropped to a vector after TRUE
items are returned. My understanding in the Stackoverflow case was that phyloseq does not understand logical indexing (TRUE
, FALSE
), but it was a long ago, and I won't make a post mortem analysis now.
That said, quite a few users of rarefy
& friends use phyloseq data, and the check was added there just to warn these people of common wrong usage of rarefaction methods. It was quite common to have user data where the smallest positive entry was in thousands because they had multiplied their data by huge numbers before getting it to rarefaction, and then these once-occurring OTUs in their arbitrary thousands could never be rarefied out. Another group with wrong data were mislead old-time community ecologists who believed the wrong and dangerous piece of advice that you should remove rare taxa that only occur once or twice – also making the data unsuitable for rarefaction (the task of rarefaction is to remove most of those rare species among some other species). I think the OTU people commit both sins: rare cases are first pruned out as read errors, and then data are multiplied by arbitrary numbers.
I also think that often the OTU tables are transposed (OTUs as rows, observations as columns) and then the results are even less meaningful. There is little we can do against this.
It seems that adding this simple function would solve some of the problems with phyloseq, and at least the problems with rarefaction functions:
as.matrix.otu_table <- function(x, ...) x@.Data
Transposed data are not handled, though.
Physloseq has an internal function veganifyOTU()
which handles the transposition too, but I suspect that also uses further internal functions?
I don't think we should be including in vegan methods for classes that we don't control.
I also think that fixing one problem will inevitably result in further problems; what happens if the unwitting user uses the method you describe on otherwise fine transposed data? I suspect in many cases it would run without error resulting in garbage results that an unwitting user would be unaware of.
While trying to fix issues with phyloseq (or with their users' use of the package) might save use some bug reports or StackOverflow posts, I really don't think we should be responsible for problems (even if infelecities or user error) with other packages.
Fine! Agreed. I didn't know about phyloseq::veganifyOTU
. So the standard answer should be "use phyloseq::veganifyOTU if you want to analyse phyloseq OTU tables in vegan".
They'd need phyloseq:::veganifyOTU
as it isn't exported.
Possible regression found:
when using vegan::rarify (v2.6-4) on an phyloiseq@otu-table object , it returns 'Error in as(x, "matrix")[i, j, drop = FALSE] : (subscript) logical subscript too long', while vegan::rarify (v2.6-2) ran without problems