GoekeLab / xpore

Identification of differential RNA modifications from nanopore direct RNA sequencing
https://xpore.readthedocs.io/
MIT License

Diffmod and yml file outcome issue #166

Michael-m6A closed 1 year ago

Michael-m6A commented 1 year ago

Hi,

I have a question about the diffmod process, yml sample structure, and consequent outcome.

I tried to compare the “infected/Wolbachia cells” condition against the “uninfected/Aag2 cells” condition (condition 1 vs condition 2). However, regardless of how I order the samples in the yml config file, the diffmod outcome is always reported as “uninfected/Aag2 cells” vs “infected/Wolbachia cells” (condition 2 vs condition 1), and the k-mer and DRM values remain nearly identical. Please see the yml config files and the k-mer comparison below.

yml config file - WB2vsAag2

data:
  Wolbachia:
    PNXP22239: /scratch/project_mnt/S0081/dataprep-PNXP22239
    PNXP22245: /scratch/project_mnt/S0081/dataprep-PNXP22245
    PNXP22247: /scratch/project_mnt/S0081/dataprep-PNXP22247
  Aag2-cell:
    PNXP22242: /scratch/project_mnt/S0081/dataprep-PNXP22242
    PNXP22246: /scratch/project_mnt/S0081/dataprep-PNXP22246
    PNXP22248: /scratch/project_mnt/S0081/dataprep-PNXP22248

out: /scratch/project_mnt/S0081/diff-mod-WBvsAag2-results

diffmod table

user:~> head /scratch/project_mnt/S0081/diff-mod-vectorbase-WB2vsAag2-results/diffmod.table
id,position,kmer,diff_mod_rate_Aag2-cells_vs_Wolbachia-cells,pval_Aag2-cells_vs_Wolbachia-cells,z_score_Aag2-cells_vs_Wolbachia-cells,mod_rate_Wolbachia-cells-PNXP22239,mod_rate_Wolbachia-cells-PNXP22245,mod_rate_Wolbachia-cells-PNXP22247,mod_rate_Aag2-cells-PNXP22242,mod_rate_Aag2-cells-PNXP22246,mod_rate_Aag2-cells-PNXP22248,coverage_Wolbachia-cells-PNXP22239,coverage_Wolbachia-cells-PNXP22245,coverage_Wolbachia-cells-PNXP22247,coverage_Aag2-cells-PNXP22242,coverage_Aag2-cells-PNXP22246,coverage_Aag2-cells-PNXP22248,mu_unmod,mu_mod,sigma2_unmod,sigma2_mod,conf_mu_unmod,conf_mu_mod,mod_assignment

user:/QRISdata/Q5334> grep AAEL028996-RA xpore-diffmod-table-vectorbase-WB2vsAag2 | grep CCACA

AAEL028996-RA,2798,CCACA,-0.007467517704232068,0.8649337516551788,-0.17009712860092385,0.16025812958580235,0.12944683750201516,0.12718184490274753,0.18307264956706312,0.11703984181124805,0.09437176749955768,161.99999999999994,255.00000000000003,234.00000000000023,49.0,124.00000000000004,76.0,76.84319439223911,70.92655185442675,1.6347343977761477,4.14398080999623,0.508436860422544,0.012923274225890045,lower

yml config file - Aag2vsWB2

data:
  Aag2-cells:
    PNXP22242: /scratch/project_mnt/S0081/dataprep-PNXP22242
    PNXP22246: /scratch/project_mnt/S0081/dataprep-PNXP22246
    PNXP22248: /scratch/project_mnt/S0081/dataprep-PNXP22248
  Wolbachia-cells:
    PNXP22239: /scratch/project_mnt/S0081/dataprep-PNXP22239
    PNXP22245: /scratch/project_mnt/S0081/dataprep-PNXP22245
    PNXP22247: /scratch/project_mnt/S0081/dataprep-PNXP22247

out: /scratch/project_mnt/S0081/diff-mod-results

diffmod table

user:~> head /scratch/project_mnt/S0081/diff-mod-vectorbase-results/diffmod.table
id,position,kmer,diff_mod_rate_Aag2-cells_vs_Wolbachia-cells,pval_Aag2-cells_vs_Wolbachia-cells,z_score_Aag2-cells_vs_Wolbachia-cells,mod_rate_Aag2-cells-PNXP22242,mod_rate_Aag2-cells-PNXP22246,mod_rate_Aag2-cells-PNXP22248,mod_rate_Wolbachia-cells-PNXP22239,mod_rate_Wolbachia-cells-PNXP22245,mod_rate_Wolbachia-cells-PNXP22247,coverage_Aag2-cells-PNXP22242,coverage_Aag2-cells-PNXP22246,coverage_Aag2-cells-PNXP22248,coverage_Wolbachia-cells-PNXP22239,coverage_Wolbachia-cells-PNXP22245,coverage_Wolbachia-cells-PNXP22247,mu_unmod,mu_mod,sigma2_unmod,sigma2_mod,conf_mu_unmod,conf_mu_mod,mod_assignment

user:/QRISdata/Q5334> grep AAEL028996-RA xpore-diffmod-table-vectorbase-Aag2vsWB2 | grep CCACA

AAEL028996-RA,2798,CCACA,-0.007467517704233206,0.8649337516551545,-0.1700971286009548,0.18307264956704777,0.11703984181124052,0.09437176749955156,0.1602581295857945,0.1294468375020065,0.12718184490273848,49.000000000000014,124.00000000000001,76.00000000000001,161.99999999999994,254.99999999999983,234.00000000000014,76.84319439223914,70.9265518544264,1.634734397776206,4.143980809995678,0.5084368604225343,0.012923274225883468,lower
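
As a sanity check on the two rows above, the following Python sketch compares the reported diff_mod_rate with the difference of the per-condition mean mod rates (Aag2 minus Wolbachia). The numbers are copied from the rows quoted in this post. The dictionary keys and helper names are made up for illustration, and the agreement is an observation about this one row, not a statement of how xpore computes the value internally.

```python
import math

# Values copied from the two diffmod.table rows quoted above
# (AAEL028996-RA, position 2798, k-mer CCACA). Key names are illustrative.
run_wolbachia_listed_first = {
    "diff_mod_rate": -0.007467517704232068,
    "wolbachia": [0.16025812958580235, 0.12944683750201516, 0.12718184490274753],
    "aag2": [0.18307264956706312, 0.11703984181124805, 0.09437176749955768],
}
run_aag2_listed_first = {
    "diff_mod_rate": -0.007467517704233206,
    "aag2": [0.18307264956704777, 0.11703984181124052, 0.09437176749955156],
    "wolbachia": [0.1602581295857945, 0.1294468375020065, 0.12718184490273848],
}


def mean(values):
    """Plain (unweighted) arithmetic mean."""
    return sum(values) / len(values)


def aag2_minus_wolbachia(run):
    """Difference of mean mod rates, Aag2 replicates minus Wolbachia replicates."""
    return mean(run["aag2"]) - mean(run["wolbachia"])


for label, run in [
    ("Wolbachia listed first", run_wolbachia_listed_first),
    ("Aag2 listed first", run_aag2_listed_first),
]:
    computed = aag2_minus_wolbachia(run)
    agrees = math.isclose(computed, run["diff_mod_rate"], rel_tol=1e-6)
    print(f"{label}: computed={computed:.6f} "
          f"reported={run['diff_mod_rate']:.6f} agrees={agrees}")
```

For this row, both runs agree with the Aag2-minus-Wolbachia difference, which is consistent with the `Aag2-cells_vs_Wolbachia-cells` column name in both tables: changing the order in the config file did not change the direction of the comparison.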

Is there a better way to structure the yml config file and define how conditions are compared to each other?

Many thanks in advance.

Michael

yuukiiwa commented 1 year ago

Hi Michael (tagging you here @Michael-m6A),

xpore diffmod orders the conditions alphabetically by name, so a little trick we use is to add "a" in front of condition 1 and "b" in front of condition 2 in the config.yml file, which will look like the following:

data:
  b_Aag2-cells:
    PNXP22242: /scratch/project_mnt/S0081/dataprep-PNXP22242
    PNXP22246: /scratch/project_mnt/S0081/dataprep-PNXP22246
    PNXP22248: /scratch/project_mnt/S0081/dataprep-PNXP22248
  a_Wolbachia-cells:
    PNXP22239: /scratch/project_mnt/S0081/dataprep-PNXP22239
    PNXP22245: /scratch/project_mnt/S0081/dataprep-PNXP22245
    PNXP22247: /scratch/project_mnt/S0081/dataprep-PNXP22247

out: /scratch/project_mnt/S0081/diff-mod-results
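
To preview which direction a given pair of condition names would produce, here is a minimal Python sketch. It assumes the ordering behaves like Python's default string sort (case-sensitive, by code point) and that the column suffix has the form `<first>_vs_<second>`, as in the tables quoted in this thread. The function name `comparison_label` is made up for illustration and is not part of xpore.

```python
def comparison_label(condition_names):
    """Return '<first>_vs_<second>' for two condition names under a plain string sort.

    Assumption: this mimics an alphabetical ordering of the condition names;
    it is not taken from the xpore source code.
    """
    first, second = sorted(condition_names)
    return f"{first}_vs_{second}"


# Original names: Aag2 sorts first, whatever the order in the config file.
print(comparison_label(["Wolbachia-cells", "Aag2-cells"]))
# -> Aag2-cells_vs_Wolbachia-cells

# Prefixed names from the config above: Wolbachia now sorts first.
print(comparison_label(["b_Aag2-cells", "a_Wolbachia-cells"]))
# -> a_Wolbachia-cells_vs_b_Aag2-cells
```

Because the sort is on the names themselves, any prefix that makes the intended first condition sort earlier should have the same effect.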

Thanks!

Best wishes, Yuk Kei

Michael-m6A commented 1 year ago

Hi Yuk Kei (@yuukiiwa),

Thank you very much for your answer and clarification.

We’ve attempted a variety of naming options in the meantime and worked out that using X in front of Aag2 achieved the desired order of comparison.

yml config file

data:
  Wolbachia:
    WB2-1: /scratch/project_mnt/S0081/dataprep-PNXP22239
    WB2-2: /scratch/project_mnt/S0081/dataprep-PNXP22245
    WB2-3: /scratch/project_mnt/S0081/dataprep-PNXP22247
  XAag2-cell:
    XAag2-1: /scratch/project_mnt/S0081/dataprep-PNXP22242
    XAag2-2: /scratch/project_mnt/S0081/dataprep-PNXP22246
    XAag2-3: /scratch/project_mnt/S0081/dataprep-PNXP22248

out: /scratch/project_mnt/S0081/diff-mod-vectorbase-WB2vsXAag2-tables

diffmod table

id,position,kmer,diff_mod_rate_Wolbachia_vs_XAag2-cell,pval_Wolbachia_vs_XAag2-cell,z_score_Wolbachia_vs_XAag2-cell,
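
For anyone who wants to check the direction of a table without reading the header by eye, a small Python sketch follows. It relies only on the `diff_mod_rate_<first>_vs_<second>` column naming seen in the tables in this thread. The function name `comparison_direction` is hypothetical, and the header string is the truncated one quoted above.

```python
import re

# Column-name pattern as it appears in the tables quoted in this thread.
DIFF_COLUMN = re.compile(r"^diff_mod_rate_(.+)_vs_(.+)$")


def comparison_direction(header_line):
    """Return (first_condition, second_condition) from a diffmod.table header line.

    Only the first matching column is reported. Condition names that
    themselves contain '_vs_' would make the split ambiguous.
    """
    for column in header_line.strip().split(","):
        match = DIFF_COLUMN.match(column)
        if match:
            return match.group(1), match.group(2)
    raise ValueError("no diff_mod_rate_*_vs_* column found")


header = (
    "id,position,kmer,diff_mod_rate_Wolbachia_vs_XAag2-cell,"
    "pval_Wolbachia_vs_XAag2-cell,z_score_Wolbachia_vs_XAag2-cell,"
)
print(comparison_direction(header))  # ('Wolbachia', 'XAag2-cell')
```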

Many thanks, Michael

yuukiiwa commented 1 year ago

Hi @Michael-m6A,

Glad that it works out well!!

Best wishes, Yuk Kei