Lab-CoMBINE / PoreMeth2

DMRs detection, annotation and analysis

Error When Running PoreMeth2DMR Function for DMR Identification Using Entropy Files #2

Closed priyankagaikwad2208 closed 3 weeks ago

priyankagaikwad2208 commented 1 month ago

Hello martabaragli

I am encountering an error while running the PoreMeth2DMR function for DMR identification using entropy files for control and test samples. The error message indicates that there are NA, NaN, or Inf values in the data passed to the BidimensionalSLMSegIn function.

Input Data

I generated the entropy files from the Modkit extract output after running the ModkitResorter.sh script. Below are example entries from the output files:

Modkit extract output:

940b8a91-2edc-4eac-9360-3ac049f775bd 61 200027 NC_004354.4 + + + 52 54 264 0 m 29 .ATCCA C C true 0
940b8a91-2edc-4eac-9360-3ac049f775bd 62 200028 NC_004354.4 + + + 52 54 264 0 m 29 .TCCAT C C true 0
940b8a91-2edc-4eac-9360-3ac049f775bd 68 200034 NC_004354.4 + + + 52 54 264 0.24023438 m 21 . TTCCA C C false 0
940b8a91-2edc-4eac-9360-3ac049f775bd 69 200035 NC_004354.4 + + + 52 54 264 0.16992188 m 21 . TCCAG C C false 0
940b8a91-2edc-4eac-9360-3ac049f775bd 73 200039 NC_004354.4 + + + 52 54 264 0.36914063 m 16 . GACCT C C false 0
940b8a91-2edc-4eac-9360-3ac049f775bd 74 200040 NC_004354.4 + + + 52 54 264 0.35351563 m 17 . ACCTA C C false 0

Test Entropy File:

NC_024511.2 33 0 278 0 286
NC_024511.2 153 0.0191173476052605 511 0 529
NC_024511.2 193 0 3 0 553
NC_024511.2 194 0.0552839723667078 454 0.00835073068893528 479
NC_024511.2 204 0 537 0 549
NC_024511.2 205 0.00641083333019476 548 0.00177935943060498 562
NC_024511.2 286 0.0128200623601759 548 0.00177935943060498 562
NC_024511.2 332 0 5 0 5

Control Entropy File: (Similar structure to the test file, not shown for brevity.)

I executed the following R code to read the entropy files and run the DMR identification:

Load entropy files

TableTest <- fread("Test.entropy.file.tsv")
TableControl <- fread("Control.entropy.file.tsv")

Execute the command for DMR identification

TableDMR <- PoreMeth2DMR(TableTest, TableControl, omega = 0.1, eta = 1e-5, FW = 3)

Error Message

Error in BidimensionalSLMSegIn(rbind(MDSData), muk, mi, smu, sepsilon, :
  NA/NaN/Inf in foreign function call (arg 7)

Troubleshooting Steps

I checked both TableTest and TableControl for NA, NaN, and Inf values. None of these checks returned TRUE, so neither dataset appears to contain invalid values.
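For reference, a check of this kind could look like the minimal sketch below. The helper name `count_non_finite` and the toy table are hypothetical and are not part of PoreMeth2; the column names are made up for illustration. It counts NA, NaN, and infinite entries in each numeric column using base R only.

```r
# Hypothetical helper (not part of PoreMeth2): count NA, NaN, Inf and -Inf
# entries in every numeric column of a table.
count_non_finite <- function(tbl) {
  df <- as.data.frame(tbl)  # also accepts a data.table returned by fread()
  numeric_cols <- vapply(df, is.numeric, logical(1))
  vapply(
    df[, numeric_cols, drop = FALSE],
    function(col) sum(!is.finite(col)),  # is.finite() is FALSE for NA, NaN, Inf
    integer(1)
  )
}

# Toy table with made-up column names and values, for illustration only.
toy <- data.frame(
  chrom    = c("chrA", "chrA", "chrA"),
  pos      = c(33L, 153L, 193L),
  entropy  = c(0, 0.019, NaN),
  coverage = c(278, Inf, 3)
)

print(count_non_finite(toy))
```

A result of all zeros only shows that the input tables are clean. Non-finite values can still arise during a computation, for example when a quantity is divided by a standard deviation of zero.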

Could you please provide insights on the expected structure of TableTest and TableControl objects? Any additional guidance on resolving this error would be greatly appreciated.

Thank you!

martabaragli commented 1 month ago

Hello priyankagaikwad2208,

Could you tell me whether the test and control files contain the same chromosome names, or whether some chromosomes are reported in only one of them? If the chromosome names match in both files, could you please send us your entropy files so that we can look into the problem?
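One way to compare the chromosome names is sketched below. The helper `compare_chromosomes` is hypothetical, and the sketch assumes the chromosome name sits in the first column of each table, as in the example rows posted above; the toy tables are made up.

```r
# Hypothetical helper (not part of PoreMeth2): list chromosome names that
# appear in only one of the two tables. Assumes the chromosome name is in
# the first column; adjust chrom_col if your layout differs.
compare_chromosomes <- function(test_tbl, control_tbl, chrom_col = 1) {
  test_chr    <- unique(as.character(as.data.frame(test_tbl)[[chrom_col]]))
  control_chr <- unique(as.character(as.data.frame(control_tbl)[[chrom_col]]))
  list(
    only_in_test    = setdiff(test_chr, control_chr),
    only_in_control = setdiff(control_chr, test_chr),
    shared          = intersect(test_chr, control_chr)
  )
}

# Toy tables with made-up values, for illustration only.
test_toy    <- data.frame(chrom = c("chr1", "chr1", "chrM"), pos = c(10L, 20L, 5L))
control_toy <- data.frame(chrom = c("chr1", "chr2"), pos = c(10L, 7L))

chrom_report <- compare_chromosomes(test_toy, control_toy)
print(chrom_report)
```

If `only_in_test` and `only_in_control` are both empty, the two files report the same set of chromosomes.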

Thank you, Marta

priyankagaikwad2208 commented 1 month ago

Hello martabaragli,

Kindly find the attached files: control_1_barcode08_5mC .csv test_2_barcode09_5mC.csv

martabaragli commented 1 month ago

Hello priyankagaikwad2208,

We tried running PoreMeth2 on your tables and found that the distribution of entropy in your samples (used by the segmentation algorithm) has mean=0 and standard deviation=0. As a result, our model is not able to calculate DMRs.

We are adding a function to the code that checks these values and reports this situation when it occurs. That said, this should not happen unless the positions provided to PoreMeth2 have extremely low methylation levels (which is the case for the MT contig).
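Until such a check is available, a similar test can be run by hand before calling PoreMeth2DMR. The sketch below is only an illustration and is not the check being added to PoreMeth2: the helper `summarise_signal`, its tolerance, and the example vectors are all hypothetical. It reports the mean and standard deviation of a numeric vector and flags it when the standard deviation is effectively zero.

```r
# Hypothetical helper (not the check implemented in PoreMeth2): summarise a
# numeric signal and flag it when its standard deviation is effectively zero.
summarise_signal <- function(x, tol = 1e-12) {
  x <- x[is.finite(x)]                               # ignore NA, NaN, Inf
  m <- if (length(x) > 0) mean(x) else NA_real_
  s <- if (length(x) > 1) sd(x) else NA_real_
  list(mean = m, sd = s, degenerate = is.na(s) || s < tol)
}

# Made-up vectors for illustration: one constant signal, one that varies.
flat_signal   <- c(0, 0, 0, 0)
varied_signal <- c(0, 0.019, 0.055)

print(summarise_signal(flat_signal))
print(summarise_signal(varied_signal))

# Standardising a constant signal divides 0 by 0, which gives NaN in R.
print((flat_signal - mean(flat_signal)) / sd(flat_signal))
```

The last line shows why a zero standard deviation matters: dividing by it turns clean input into NaN values, which is consistent with the kind of error reported above even though the input tables themselves contain no invalid entries.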

However, if the files you shared are not the complete entropy files and you only shared one chromosome, the issue might have a different origin. If so, could you kindly share the complete files?

Thank you