HCBravoLab / metagenomeSeq

Statistical analysis for sparse high-throughput sequencing
64 stars 20 forks source link

Error in intTime[i, 2] = timePoints[tail(abundanceDifference, n = 1)] : replacement has length zero #52

Open MarRey opened 7 years ago

MarRey commented 7 years ago

Hello,

The error in the title, namely:

Error in intTime[i, 2] = timePoints[tail(abundanceDifference, n = 1)] : replacement has length zero

ensued whilst (test)running the following code:

fTS <- lapply(rownames(fData(MGS_f)), function(i){ fitTimeSeries(obj=MGS_f, feature=i, class="birthmode", id="ID", time="Timepoint_days", C=0, B=1) })

with MGS_f being my filtered newMRexperiment and the rownames(fData(MGS_f) my taxa/features. Strangely, the above code runs without an error when changing the class to another variable, such as gender. Both the birthmode and gender variables don't contain any NAs and have the same structure (integer).

I tried narrowing down the problem, and it seems the function won't work due to a certain row/taxon/feature. I've looked into this feature, thinking it might not be present at certain timepoints, and indeed, it is only present in 1 sample at 1 specific sample, so I thought this might be the cause of the error, because a comparison between the two classes can't be made. However, when running the same analysis for the variable gender, there is no issue.

What could be the cause of the error? I'd appreciate any insight into my problem! Thank you very much in advance.

jeffkimbrel commented 4 years ago

I am having the same issue, but found I can successfully run my version of your above code if I add log = FALSE. I'm not sure why that helps... all of my counts are positive integers, or 0... so perhaps the log(0) giving -Inf call somewhere in the function is giving an error?

hcorrada commented 4 years ago

Thank you for writing. Can you provide code for a minimal example causing the issue?

jeffkimbrel commented 4 years ago

Attached is a zip file with a full dataset with the problem, and a reduced dataset that removes the problem. This isn't how I intended to do this, but making a minimal dataset with just the offending taxonomic class removed the issue, but appears to have added another. These are both objects made using phyloseq_to_metagenomeSeq().

ms_fts.zip

Full dataset

I knew from iterating over the classes (using the code in the vignette and the original comment on this issue) that the code died running against the Opitutae class. Running twice below, changing only the log = TRUE or log = FALSE.

> res.full = fitTimeSeries(ms.full, lvl = "class", feature='Opitutae', class='HOST', time='DATE', id='SAMPLE', C=0.3, B=10, log=TRUE, norm=TRUE)
Error in intTime[i, 2] <- timePoints[tail(abundanceDifference, n = 1)] : 
  replacement has length zero
> res.full = fitTimeSeries(ms.full, lvl = "class", feature='Opitutae', class='HOST', time='DATE', id='SAMPLE', C=0.3, B=10, log=FALSE, norm=TRUE)
> res.full$timeIntervals
     Interval start Interval end     Area    p.value
[1,]              0           40 18446.81 0.09090909

Reduced dataset

In phyloseq, I made a reduced dataset by subsetting out just the Verrucomicrobia phylum, which opitutae is a class of. In order to run phyloseq_to_metagenomeSeq() I had to also remove some samples that had zero counts after subsetting which dropped it from 54 to 45 samples.

Here, running with either option for log gives no errors, but it appears that no permutations are being run with log = TRUE.

> res.ver.TRUE = fitTimeSeries(ms.ver, lvl = "class", feature='Opitutae', class='HOST', time='DATE', id='SAMPLE', C=0.3, B=10, log=TRUE, norm=TRUE)
> res.ver.FALSE = fitTimeSeries(ms.ver, lvl = "class", feature='Opitutae', class='HOST', time='DATE', id='SAMPLE', C=0.3, B=10, log=FALSE, norm=TRUE)
> length(res.ver.TRUE$perm)
[1] 0
> length(res.ver.FALSE$perm)
[1] 10
> res.ver.TRUE$timeIntervals
[1] "No statistically significant time intervals detected"
> res.ver.FALSE$timeIntervals
     Interval start Interval end     Area    p.value
[1,]             37           49 324176.8 0.09090909

Let me know if you need more information or examples, or if the files I have provided don't give enough to reproduce.

> packageVersion("metagenomeSeq")
[1] ‘1.28.2’
> packageVersion("phyloseq")
[1] ‘1.30.0’
> packageVersion("gss")
[1] ‘2.1.12’