Error logical subscript too long when training data

CenterForMedicalGeneticsGhent / PREFACE

PREFACE -- PREdict FetAl ComponEnt

GNU General Public License v3.0

Error logical subscript too long when training data #5

Open KhanhLPBao opened 3 years ago

KhanhLPBao commented 3 years ago

I'm trying to train on a new batch of data. After loading all the .bed files, the script shows this error and stops:

Error in training.frame.sub["X" == training.frame$chr, ] : 
  (subscript) logical subscript too long
Calls: train -> as.data.frame
Execution halted

Does anyone know how to fix it? Thank you very much

leraman commented 3 years ago

I'm guessing one or more of your bed files have a different number of lines per chromosome. Could you double-check that they were all created with the same WCX reference?
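
One way to test this guess is to count the lines per chromosome in every .bed file and compare the counts across files. The following is a minimal sketch in Python, assuming tab-separated files whose first column is the chromosome and whose first line is a header. The function names are hypothetical and are not part of PREFACE or WisecondorX.

```python
from collections import Counter


def bins_per_chromosome(bed_path, has_header=True):
    """Count data lines per chromosome (first tab-separated column)."""
    counts = Counter()
    with open(bed_path) as handle:
        if has_header:
            next(handle, None)  # skip the header line (assumed to be present)
        for line in handle:
            if line.strip():  # ignore blank lines
                counts[line.split("\t", 1)[0].strip()] += 1
    return dict(counts)


def find_mismatches(bed_paths, has_header=True):
    """Return {path: counts} for files whose counts differ from the first file."""
    bed_paths = [str(p) for p in bed_paths]
    if not bed_paths:
        return {}
    reference = bins_per_chromosome(bed_paths[0], has_header)
    mismatches = {}
    for path in bed_paths[1:]:
        counts = bins_per_chromosome(path, has_header)
        if counts != reference:
            mismatches[path] = counts
    return mismatches
```

Calling `find_mismatches` on the full list of training .bed files returns an empty dict when every file has the same chromosomes with the same number of lines. Any file it reports is a candidate for having been produced with a different reference, or for containing chromosomes (such as Y) that the others lack.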

KhanhLPBao commented 3 years ago

Sorry for the late reply. I will check all the files again. Thank you very much.

KhanhLPBao commented 3 years ago

@leraman I have rerun all the samples and checked their line counts. They are the same, but the problem still occurs. My procedure is:

Convert .bam to .npz using WisecondorX.
Make the .bed files with WisecondorX predict. The reference is the all-male reference I made with this command:

WisecondorX newref /path/to/samples sampleM.npz --yfrac 0 (I set --yfrac 0 so that all samples are treated as male)

Create a .bed file that PREFACE can read with this command:

cut -f1,2,3,5 _bins.bed > .bed

leraman commented 3 years ago

You should use --nipt during WisecondorX newref if you want to create a NIPT reference.

KhanhLPBao commented 3 years ago

@leraman Hi, I tried using --nipt, but it returned this output:

A NIPT reference should have at least 5 female feti samples. Removing --nipt flag.

I know the Y chromosome signal depends on the fetal fraction of the sample, but is there any way to calculate FF based on the Y fraction? As I mentioned in issue #6, the results just repeat the same 2-3 numbers and r is very low.

leraman commented 3 years ago

For NIPT, you cannot analyze the Y chromosome with WisecondorX, so you'll need to use --nipt and leave out --yfrac. We haven't provided code to calculate FFY, but there are multiple possibilities. You'll have to do some custom parsing on the bam files to define an FFY that best fits your protocol.
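
The thread leaves the FFY definition open. Purely as an illustration, and not as a method provided by PREFACE or WisecondorX, one common starting point is a two-point linear mixing model: the fraction of reads mapping to Y in a male-fetus pregnancy is treated as a blend of a female background level and an adult male level. In the sketch below, every name is hypothetical, and the three Y read fractions are assumed to come from your own bam parsing and your own baseline samples.

```python
def estimate_ffy(y_fraction, y_fraction_female, y_fraction_male):
    """Estimate fetal fraction from the chrY read fraction (illustrative only).

    Assumes y_fraction = ff * y_fraction_male + (1 - ff) * y_fraction_female,
    where:
      y_fraction        : chrY read fraction of the pregnancy sample
      y_fraction_female : background chrY read fraction in female-only samples
      y_fraction_male   : chrY read fraction in adult male samples
    Solving for ff gives the expression below.
    """
    span = y_fraction_male - y_fraction_female
    if span <= 0:
        raise ValueError("male baseline must exceed female baseline")
    ff = (y_fraction - y_fraction_female) / span
    # Clamp to [0, 1]: noise can push the raw estimate slightly out of range.
    return min(max(ff, 0.0), 1.0)
```

The estimate is only meaningful for pregnancies with a male fetus, and its accuracy depends on how the baselines were measured (same library prep, same mapping filters, same regions of Y). A protocol-specific calibration may fit better than this simple linear form.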

cverwimp commented 3 years ago

I encountered the same error, fixed it by replacing line 191 of PREFACE.R with:

training.frame <- training.frame[training.frame$chr != 'Y', ]
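
An alternative that avoids editing PREFACE.R is to drop the Y rows from each .bed file before training. The following Python sketch assumes tab-separated files with a header line and the chromosome in the first column, labelled either `Y` or `chrY`. The function name is hypothetical, and whether PREFACE then trains cleanly on the filtered files is an assumption that this thread does not confirm.

```python
def drop_chromosome_y(in_path, out_path, has_header=True):
    """Copy a .bed file without its chromosome Y rows; return the rows dropped."""
    dropped = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for index, line in enumerate(src):
            if has_header and index == 0:
                dst.write(line)  # keep the header unchanged
                continue
            chrom = line.split("\t", 1)[0].strip()
            if chrom in ("Y", "chrY"):
                dropped += 1  # skip Y bins
                continue
            dst.write(line)
    return dropped
```

The return value shows how many Y bins a file contained, which also helps spot files that were produced without --nipt.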

leraman commented 3 years ago

Hi @cverwimp

Did you use WisecondorX (if yes: you should use the --nipt flag, which makes sure there is no Y data) or some other software for copy number profiling?

Thanks,

Lennart

happenywong commented 3 years ago

I ran into the same problem. How did you solve it?

happenywong commented 3 years ago

I used your method and it solved the problem, thanks!