pogorely / ALICE

Detecting TCR involved in immune responses from single RepSeq datasets
GNU General Public License v3.0

Some bugs(?) #2

Closed wyattmcdonnell closed 6 years ago

wyattmcdonnell commented 6 years ago

Hey Misha!

This is very exciting work—congratulations on pulling all of this together!

I'm able to replicate your results from start to finish using your sample code and the files under /samples. Here's a link to some files I'm using to test ALICE, and which appear to break things in a few ways—I made sure everything followed IMGT nomenclature (TRBV, TRBJ), removed columns not found in your example .tsv files, and made sure that the headers/column names matched yours too. A few things popped up when I tried to use some different labels:

pre <- fread(input = '~/Downloads/test.Brain_2.txt')
post <- fread(input = '~/Downloads/test.Brain_1.txt')
test <- list(d0=pre,d15=post)
test_alice<-ALICE_pipeline(DTlist=test,folder="test_res",cores=8,iter=10,nrec=5e5)

Error in 1:nrow(VJlist) : argument of length 0
Called from: compute_pgen_rda_folder(folder, cores = cores, nrec = nrec, iter = iter)
Browse[1]> 

Next, I tried renaming things to match your sample .tsv files closely, like this, which threw a different error, but appeared to get further in your code:

S1d0 <- fread(input = '~/Downloads/test.Brain_2.txt')
S1d15 <- fread(input = '~/Downloads/test.Brain_1.txt')
S1 <- list(d0=S1d0,d15=S1d15)
S1_alice<-ALICE_pipeline(DTlist=S1,folder="test_res",cores=8,iter=10,nrec=5e5)

Error in D > D_thres : 
  comparison (6) is possible only for atomic and list types
Called from: FUN(X[[i]], ...)
Browse[1]> 
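
One way this second message can arise in R, offered as an illustration rather than a diagnosis of ALICE: when a table has no column named `D`, an expression such as `D > D_thres` evaluated inside that table falls back to the base function `stats::D`, and a function cannot be compared with a number. The data frame and its `CDR3` column below are hypothetical, and whether ALICE reaches this state was not checked.

```r
# Hypothetical table that lacks a column named D (e.g. an empty intermediate result).
dt <- data.frame(CDR3 = c("CASSL", "CASSF"))
D_thres <- 2

# Inside with(), the name D is looked up in dt first; since it is missing there,
# R finds the function stats::D instead, and comparing a function with a number fails.
msg <- tryCatch(
  with(dt, D > D_thres),
  error = function(e) conditionMessage(e)
)
print(msg)

# A defensive check before filtering avoids the confusing message.
has_D <- "D" %in% names(dt)
print(has_D)
```

Checking that the expected column exists before filtering turns a confusing comparison error into an explicit condition that can be handled.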

Some info about the machine and version I'm using:

platform       x86_64-apple-darwin15.6.0   
arch           x86_64                      
os             darwin15.6.0                
system         x86_64, darwin15.6.0        
status                                     
major          3                           
minor          4.4                         
year           2018                        
month          03                          
day            15                          
svn rev        74408                       
language       R                           
version.string R version 3.4.4 (2018-03-15)
nickname       Someone to Lean On   

And some version info for ALICE's dependencies:

> packageVersion('stringdist')
[1] ‘0.9.4.6’
> packageVersion('igraph')
[1] ‘1.1.2’
> packageVersion('data.table')
[1] ‘1.10.4.3’
> packageVersion('Biostrings')
[1] ‘2.44.2’

Please let me know how I can help troubleshoot further—and thanks again, this is very exciting work!

Cheers, Wyatt

pogorely commented 6 years ago

Dear Wyatt,

Thank you very much for this detailed bug report. I've reproduced your error; there was indeed a minor bug (this error occurs if no significant results are found). This is now fixed. Also note that in your test.Brain_1.txt it finds significant results only if you change the Read.count threshold Read_thres to 0 (by default ALICE filters out singletons for additional error correction; you can change this in the new version).
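
As an illustration of why an empty result produces this particular message: in base R, `nrow()` of `NULL` is `NULL`, and `1:NULL` fails with "argument of length 0". The sketch below reproduces that in isolation. `VJlist` here is just a stand-in set to `NULL` (an assumption; the object ALICE builds was not inspected), and the guard shown is one possible pattern, not necessarily the fix that was applied.

```r
# Stand-in for an empty result; NOT the real VJlist that ALICE constructs.
VJlist <- NULL

# nrow(NULL) is NULL, so 1:nrow(VJlist) becomes 1:NULL and raises an error.
msg <- tryCatch(
  {
    1:nrow(VJlist)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# One possible guard: NROW() treats NULL as having 0 rows and seq_len(0) is empty,
# so the loop body is simply skipped when there is nothing to iterate over.
visited <- 0
for (i in seq_len(NROW(VJlist))) {
  visited <- visited + 1
}
print(visited)
```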

Please update ALICE.R and try this:

pre <- fread(input = '~/Downloads/test.Brain_2.txt')
post <- fread(input = '~/Downloads/test.Brain_1.txt')
test<-list(d0=pre,d15=post)
test_alice<-ALICE_pipeline(DTlist=test,folder="test_res0",cores=1,iter=1,nrec=5e5,Read_thres = 0) #algorithm will use everything with Read.count>0, the default is 1
test_alice

Best, Misha

wyattmcdonnell commented 6 years ago

Hey Misha!

Happy to report that the latest update of ALICE fixes this on my machine, and also that I was able to run the sample code you provided above! Keep up the great work!

Best wishes, Wyatt

EdGreen21 commented 5 years ago

Hi Misha, Hi Wyatt,

I ran into the same problem with my files - so I looked at issue reports and found this. I downloaded Wyatt's files and pulled the latest version of ALICE.R from Github to see if the problem was specific to me.

I ran the code as Misha suggested in this issue report using Wyatt's files, substituting 'Read_count_filter=0' for 'Read_thres = 0' as this variable name has changed since this issue was raised. However, I still get the error: Error in 1:nrow(VJlist) : argument of length 0

Has there perhaps been an error reversion here? I went through the recent commits but I haven't yet managed to solve this, so I thought I'd reach out and see if you guys had a solution - we're definitely excited to try this on our samples!

UPDATE: this breaks after commit e193819d

Best

Ed

wyattmcdonnell commented 5 years ago

Hi Ed,

I haven't updated my version of ALICE.R on this machine, and am still able to run everything without any errors popping up (and all output looks as it should). To me this suggests a possible error reversion, but Misha is absolutely more qualified to answer this than I am.

Cheers, Wyatt

EdGreen21 commented 5 years ago

Cheers Wyatt, I can run your files on commit e193819 and get a result, and the code doesn't crash using commit c2a19f05 (this requires correcting an open if statement at the end) but doesn't generate any output.

pogorely commented 5 years ago

Hi Ed!

Sorry, this bug did indeed recur in the latest version; it is now fixed again. Please tell me if it works for you (I've checked that it works for Wyatt's files; note the different parameter names for the Read.count thresholds):

pre <- fread(input = '~/Downloads/ALICE/test.Brain_2.txt')
post <- fread(input = '~/Downloads/ALICE/test.Brain_1.txt')
test<-list(d0=pre,d15=post)
test_alice<-ALICE_pipeline(DTlist=test,folder="test_res0",cores=1,iter=1,nrec=5e5,Read_count_filter=0,Read_count_neighbour =1)

Best, Misha

EdGreen21 commented 5 years ago

Hi @pogorely, I just git pulled #6a840ff, checked that the 2 edits were in ALICE.R, re-downloaded Wyatt's files and re-sourced ALICE.R, but I still get the same error: Error in 1:nrow(VJlist) : argument of length 0. I'll try from a fresh install.

EdGreen21 commented 5 years ago

So I observed some odd behaviour when altering the Read_count_neighbour parameter: initially set to 1 it caused a fault, but setting it to 0 then allowed the code to complete. However, changing this back to 1 and re-running allowed it to complete:

> test_alice <- ALICE_pipeline(DTlist=test,
+                              folder="test_res0",
+                              cores=1,
+                              iter=1,
+                              nrec=5e5,
+                              Read_count_filter=0,
+                              Read_count_neighbour=1)
Error in 1:nrow(VJlist) : argument of length 0
> test_alice <- ALICE_pipeline(DTlist=test,
+                              folder="test_res0",
+                              cores=1,
+                              iter=1,
+                              nrec=5e5,
+                              Read_count_filter=0,
+                              Read_count_neighbour=0)
> View(test_alice)
> test_alice <- ALICE_pipeline(DTlist=test,
+                              folder="test_res0",
+                              cores=1,
+                              iter=1,
+                              nrec=5e5,
+                              Read_count_filter=0,
+                              Read_count_neighbour=1)

This would suggest either (1) some variables are kept between runs, or (2) there's a stochastic element in the calculations, so that in this dataset it's a borderline call whether a TCR motif is called as significant or not. Increasing iterations and nrec would probably fix this, but having the code handle this error more robustly would be really helpful for cases when many samples are analysed at once.
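
A rough sketch of how the two hypotheses could be separated, and how a batch could survive a single failing sample. `run_one` is a hypothetical stand-in for a call such as `ALICE_pipeline(...)`; it fails at random purely to imitate a borderline dataset. Whether the real pipeline honours `set.seed()`, especially with cores > 1, is an assumption that has not been tested here.

```r
# Hypothetical stand-in for one pipeline call; it fails at random to imitate
# a borderline dataset. Replace with the real call when adapting this sketch.
run_one <- function(sample_name) {
  if (runif(1) < 0.5) stop("argument of length 0")
  paste("result for", sample_name)
}

# Run one sample with a fixed seed and turn any error into a NULL result,
# so a failing sample does not abort the whole batch.
safe_run <- function(sample_name, seed) {
  set.seed(seed)
  tryCatch(
    run_one(sample_name),
    error = function(e) {
      message(sample_name, " failed: ", conditionMessage(e))
      NULL
    }
  )
}

samples <- c("S1", "S2", "S3")
first  <- lapply(samples, safe_run, seed = 42)
second <- lapply(samples, safe_run, seed = 42)

# With the same seed both passes should agree; if they did not, state kept
# between runs would be the more likely explanation.
print(identical(first, second))
```

If repeated runs with a fixed seed still disagree, leftover state (for example files already present in the output folder) becomes the more plausible cause.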

UPDATE: I did this again, and now I seem to get clean fails... I will keep trying until I can get a reproducible error.

Florian411 commented 2 years ago

Is there any news on this? I am having the same problem with the current version of the package.