molgenis / NIPTeR

R Package for Non Invasive Prenatal Testing (NIPT) analysis
GNU Lesser General Public License v3.0

add_samples_controlgroup always Error: More than one strand type in control group #22

Open KhanhLPBao opened 3 years ago

KhanhLPBao commented 3 years ago

I want to add a sample to an already created control group, but every time it returns Error: More than one strand type in control group. My code for creating the control group follows what I read in the manual:

library(NIPTeR)
bam_filepaths <- list.files(path = "path/to/control/folder", pattern = "sorted.bam", full.names = TRUE)
control_group <- as_control_group(nipt_samples = lapply(X = bam_filepaths, bin_bam_sample, do_sort = FALSE, separate_strands = FALSE))
saveRDS(object = control_group, file = "path/clean_ref.rds")

This is the code I use to add a sample to the control group, also from the manual:

library(NIPTeR)
new_sample <- bin_bam_sample(bam_filepath = '/path/to/sample.sorted.bam', do_sort = FALSE, separate_strands = FALSE)
control_group <- readRDS(file = 'path/clean_ref.rds')
new_control <- add_samples_controlgroup(nipt_control_group = control_group, samples_to_add = new_sample)
saveRDS(object = new_control, file = "/home/testmachine/Documents/test/control_group.rds")

It always returns "Error: More than one strand type in control group". Can anyone explain what is happening? Thank you very much.

ljohansson commented 3 years ago

Dear Khanh,

I would expect this error if there is a discrepancy in "separate_strands" between samples, with some being set to FALSE and others being set to TRUE. A control_group can only contain a single strand type.

Regards, Lennart

KhanhLPBao commented 3 years ago

Thank you very much. Is there anything I can do to see which sample in the control group has it set to TRUE, and how to correct it? Or do I have to run bin_bam_sample on each .bam file and add it to the control group one at a time?

ljohansson commented 3 years ago

Each sample object will contain the strand information. You can create a control group using a single command, but you can also create sample objects separately and join them later on in a control group by adding them. Using this approach you could see which sample fails. For more information see the vignette: https://cran.r-project.org/web/packages/NIPTeR/vignettes/NIPTeR.html
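
A minimal sketch of such a check in base R. It assumes (this is not confirmed in this thread) that the strand type is recorded in each object's class vector, for example CombinedStrands versus SeparatedStrands, and that each sample carries a sample_name field. The mock objects below are hypothetical stand-ins so the snippet runs on its own; with real data, replace mock_group with the control group loaded via readRDS().

```r
# Mock stand-ins for NIPTeR sample objects (hypothetical; real ones come from bin_bam_sample).
# Assumption: the strand type is one of the entries in class(sample).
make_mock <- function(name, strand) {
  structure(list(sample_name = name), class = c("NIPTSample", strand))
}
mock_group <- list(samples = list(
  make_mock("s1", "CombinedStrands"),
  make_mock("s2", "CombinedStrands"),
  make_mock("s3", "SeparatedStrands")
))

# Collapse each sample's class vector into one label per sample.
strand_labels <- vapply(mock_group$samples,
                        function(s) paste(class(s), collapse = "/"),
                        character(1))
names(strand_labels) <- vapply(mock_group$samples,
                               function(s) s$sample_name,
                               character(1))

# Tabulate the labels; more than one distinct label means the group is mixed.
print(table(strand_labels))

# List the samples whose label differs from the most common one.
most_common <- names(which.max(table(strand_labels)))
odd_ones <- names(strand_labels)[strand_labels != most_common]
print(odd_ones)
```

If all samples report the same label and the error still appears, the mismatch is probably not in the samples themselves.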

KhanhLPBao commented 3 years ago

Thank you very much

KhanhLPBao commented 3 years ago

Hi @ljohansson, I have to reopen this topic because I'm running out of ideas for how to make add_samples_controlgroup work. Over more than 5 months I have tried different alignments, file treatments and other approaches, but the result is the same. I will send you the rds of the reference file (10 samples) and the binned file (from bin_bam_sample) of the 11th sample. I'm desperately running out of time; if you know what the error is, or how I can merge multiple rds files made with bin_bam_sample into one reference file, please help me. https://www.dropbox.com/s/gmjl9d2prhre5ut/samples.zip?dl=0 I have to post a Dropbox link since for some reason I cannot attach a .zip file on GitHub.

ljohansson commented 3 years ago

Dear @KhanhLPBao I indeed also see the error message. I am not sure what is the cause, because all samples are of the combinedStrands type. However, when creating the control_group from scratch I did not encounter the issue:

First I got all individual samples from the control group: e.g. for sample 1: controlsample1 <- controls$samples[[1]]

Then I created a list with all samples: list_all_11 <- list(controlsample1, controlsample2, controlsample3,controlsample4, controlsample5, controlsample6, controlsample7, controlsample8, controlsample9, controlsample10,sample)

Then I created the control group: controls_11 <- NIPTeR::as_control_group(nipt_samples = list_all_11)

Would this approach work for you?

Regards, Lennart

KhanhLPBao commented 3 years ago

Dear @ljohansson, sorry for the late reply. It worked, thank you very much. But how can I create the list automatically? I have more than 1000 samples, and all I can do is run bin_bam_sample on each one and save the result as an .rds file, because my RAM cannot handle all of them during add_samples_controlgroup.

ljohansson commented 3 years ago

Dear KhanhLPBao, No problem. I understand that building the control group in one go does not work on your system/with your data. I suggest running the bin_bam_sample step per sample, creating an rds file for each sample, and then adding them one by one (this would be 1000 steps). Or is this what doesn't work? You could also consider creating a smaller control group. I believe that a random subset of around 300 of your 1000 samples will give results similar to using all samples as controls.
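
One way to build the sample list automatically, sketched under assumptions: each sample has already been binned and saved as its own .rds file in a single directory. Here a temporary directory is filled with placeholder objects so the snippet runs without BAM files. Only as_control_group(nipt_samples = ...) is taken from this thread; the directory layout, file names and subset size are illustrative.

```r
# Placeholder data: write a few dummy per-sample .rds files to a temp directory.
# With real data, each file would hold the result of bin_bam_sample().
rds_dir <- file.path(tempdir(), "nipt_samples_demo")
dir.create(rds_dir, showWarnings = FALSE)
for (i in 1:20) {
  saveRDS(list(sample_name = sprintf("sample_%02d", i)),
          file.path(rds_dir, sprintf("sample_%02d.rds", i)))
}

# Find every per-sample file; no hand-written list of object names is needed.
rds_files <- list.files(rds_dir, pattern = "\\.rds$", full.names = TRUE)

# Optionally take a random subset to limit memory use (5 here; e.g. 300 for real data).
set.seed(1)
subset_size <- min(5, length(rds_files))
chosen_files <- sample(rds_files, size = subset_size)

# Read the chosen samples into one list.
sample_list <- lapply(chosen_files, readRDS)

# With real NIPTeR samples, the list can then be passed on:
# control_group <- NIPTeR::as_control_group(nipt_samples = sample_list)
# saveRDS(control_group, "reference.rds")
```

Reading only the chosen subset keeps memory use proportional to the subset size rather than to the full collection of samples.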

KhanhLPBao commented 3 years ago

Hi @ljohansson, can I use the commands below to automate this?

library(NIPTeR)
ref = readRDS('reference.rds')
newmem = readRDS('sample.rds')
all_samples = c(ref$samples, list(newmem))
newref = NIPTeR::as_control_group(nipt_samples = all_samples)
saveRDS(newref,'reference.rds')