koyelucd / betaHMM

betaHMM for differential analysis
1 stars 0 forks source link

Missing value error in betaHMM when using random beta values #1

Open hubert-pop opened 5 months ago

hubert-pop commented 5 months ago

Hi,

Thanks for developing such a promising package !

I just started using it and I'm having an issue with the betaHMM function. When using methylation data other that those provided with the package, it seems I cannot pass some tests and I finally get an error. Here below is an example.

# Packages import ####

library(dplyr)
library(tidyr)
library(betaHMM)

# Testing betaHMM() with different datasets ####

# annotation data
data(sample_annotation_file)
head_annotation_file <- head(sample_annotation_file, n=10) # 10 rows

# methylation data
data(sample_methylation_file)
head_methylation_file <- head(sample_methylation_file, n=10) # 10 rows
head_methylation_file <- head_methylation_file %>% select(IlmnID, 
                                                               Benign_Patient_1, Benign_Patient_2, 
                                                               Tumour_Patient_1, Tumour_Patient_2) # 5 cols

# betaHMMout -> ok
betaHMM_out <- betaHMM(head_methylation_file,
                       head_annotation_file,
                       M = 3,
                       N = 2,
                       R = 2,
                       parallel_process = FALSE,
                       seed = 321,
                       treatment_group = c("Benign","Tumour"))

No problem with the above code. Things are however different if I provide random beta-values in the methylation data file, as follows for example.

# new methylation data
head_methylation_file$Benign_Patient_1 <- c(0.8, 1, 0.33, 0.80, 0, 0.875, 0.875, 0.94, 0.937, 0.85)
head_methylation_file$Benign_Patient_2 <- c(1, 1, 0.83, 0.83, 0, 0.875, 0.815, 1, 0.95, 0.92)
head_methylation_file$Tumour_Patient_1 <- c(1, 1, 1, 1, 0, 1, 0.929, 1, 0.952, 0.958)
head_methylation_file$Tumour_Patient_2 <- c(1, 1, 1, 1, 0, 1, 1, 1, 0.955, 0.967)

# betaHMMout -> error
betaHMM_out <- betaHMM(head_methylation_file,
                       head_annotation_file,
                       M = 3,
                       N = 2,
                       R = 2,
                       parallel_process = FALSE,
                       seed = 321,
                       treatment_group = c("Benign","Tumour"))

Here I get the following issue.

Error in betaHMM_workflow(sorted_data[sorted_data$CHR == chr_unique[chr],  : 
  task 1 failed - "missing value where true/false needed"

Any idea of what I'm missing here to properly use the package ?

Thanks in advance for any insight !

koyelucd commented 5 months ago

Hello,

Thank you for bringing this issue to our attention. It seems the problem might be due to using 0 or 1 in your test cases, as the method employs logarithmic values, which can cause errors. We have added a check for this issue and have added some more changes to the method, which will be included in our next release.

We apologize for any inconvenience this may have caused. We are working to upload the corrections and changes to GitHub as soon as possible, so users can try installing the new package directly from there.

In the meantime, please try using other values and let me know if you continue to experience any issues.

Kind regards, Koyel

hubert-pop commented 5 months ago

Hi,

Thanks Koyel for such a prompt reply and explanation. Perhaps there is something else here as even after replacing 0s and 1s with other values I keep having the issue.

For example, the methylation data as provided below do not avoid it.

# New methylation data v2 (0 --> 0.02 and 1 --> 0.98)
head_methylation_file$Benign_Patient_1 <- c(0.8, 0.98, 0.33, 0.80, 0.02, 0.875, 0.875, 0.94, 0.937, 0.85)
head_methylation_file$Benign_Patient_2 <- c(0.98, 0.98, 0.83, 0.83, 0.02, 0.875, 0.815, 0.98, 0.95, 0.92)
head_methylation_file$Tumour_Patient_1 <- c(0.98, 0.98, 0.98, 0.98, 0.02, 0.98, 0.929, 0.98, 0.952, 0.958)
head_methylation_file$Tumour_Patient_2 <- c(0.98, 0.98, 0.98, 0.98, 0.02, 0.98, 0.98, 0.98, 0.955, 0.967)

So far I can't see what's causing the problem, but maybe further testing will help.

Thanks again for your valuable help.