dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Error at step 1 with a chunk #438

Closed simjoly closed 3 years ago

simjoly commented 3 years ago

Hi,

I got the following error when running step 1 after (or at the very end of) the "writing/compressing" step.

 -------------------------------------------------------------
  ipyrad [v.0.9.62]
  Interactive assembly and analysis of RAD-seq data
 -------------------------------------------------------------
  Parallel connection | blg5814.int.ets1.calculquebec.ca: 24 cores

  Step 1: Demultiplexing fastq data to Samples
    [####################] 100% 1:05:20 | chunking large files
    [####################] 100% 0:54:32 | sorting reads
    [####################] 100% 1:16:03 | writing/compressing

  Encountered an Error.
  Message: [Errno 22] Invalid argument: 'chunk1_0_aaag'
  Parallel connection closed.

I have checked and the temp file 'chunk1_0_aaag' is present in the 'tmpdir' folder.

No .json file was created and I am not too sure what to do from there. All the sample files appear to have been created, so could I force step 2 from what I have?

I tried to run the program again in a different folder, but got the same error.

Thanks for your help! Simon

isaacovercast commented 3 years ago

Hello. If step 1 fails, it's not possible to skip to step 2. When you say "all the sample files appear to have been created", what do you mean? Can you show me the output of ls -ltr *_fastqs | head to list a few of the sample files in this directory? Is there a chance you are running out of disk space? Can you verify you have enough disk with df -h? This feels like a disk space issue.
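For anyone who prefers to check free space from Python rather than with df -h, here is a minimal sketch using only the standard library. The helper name disk_report is hypothetical. It reports filesystem-wide figures, so it may not reflect a per-user quota on a shared cluster.

```python
import shutil

def disk_report(path="."):
    """Return total, used, and free bytes for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free

total, used, free = disk_report(".")
gib = 1024 ** 3  # bytes per GiB
print(f"total={total / gib:.1f} GiB  used={used / gib:.1f} GiB  free={free / gib:.1f} GiB")
```

Point the path argument at the project directory (the one holding the *_fastqs folder) so the figures describe the filesystem that ipyrad is actually writing to.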

simjoly commented 3 years ago

Hello,

I doubt it is disk space, unless I am limited on my personal account. This is the status of the scratch disk on the cluster:

File System                                     Size  Used  Avail  Avail%  Mounted on
10.72.112.13@o2ib:10.72.112.14@o2ib:/lustre04   2,6P  1,9P  757T   72%     /lustre04

Regarding the files, I have 56 samples with paired-end reads. Here is the head of the content of the fastqs folder:

total 81273873
-rw-r-----. 1 jolysimo jolysimo  475805076 16 mar 09:35 GES_quisqueyana2_R1_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  521225084 16 mar 09:37 GES_quisqueyana2_R2_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  514307321 16 mar 09:37 GES_jamaicensis_R1_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  541578245 16 mar 09:38 GES_jamaicensis_R2_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo 5201843812 16 mar 09:44 RHY_bicolor1_R1_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo 5539756470 16 mar 09:47 RHY_bicolor1_R2_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  426906474 16 mar 09:48 GES_sylvicola1_R1_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  447604712 16 mar 09:48 GES_sylvicola1_R2_.fastq.gz
-rw-r-----. 1 jolysimo jolysimo  273803584 16 mar 09:49 RHY_leucomallon_R1_.fastq.gz

All 112 files are present. The demultiplex file also seems complete.
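A count of 112 files matches 56 samples with two reads each, but it does not by itself show that every sample has both an R1 and an R2 file. The sketch below checks pairing by name. It assumes the naming pattern seen in the listing above (<sample>_R1_.fastq.gz and <sample>_R2_.fastq.gz); the helper name unpaired_samples is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern, taken from the directory listing in this thread.
PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_\.fastq\.gz$")

def unpaired_samples(filenames):
    """Return the sorted sample names that lack either the R1 or the R2 file."""
    reads = defaultdict(set)
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            reads[match.group("sample")].add(match.group("read"))
    return sorted(s for s, found in reads.items() if found != {"1", "2"})

# Small demonstration using three names from the listing.
names = [
    "GES_quisqueyana2_R1_.fastq.gz",
    "GES_quisqueyana2_R2_.fastq.gz",
    "RHY_leucomallon_R1_.fastq.gz",
]
print(unpaired_samples(names))
```

On a real run, the list of names could come from os.listdir on the fastqs directory; an empty result would mean every sample found has both reads.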

In case it is of interest, here is the content of the *_fastqs/tmpdir folder, which is where the file that seemed to be causing trouble is:

total 184035294
-rw-r-----. 1 jolysimo jolysimo 17246774894 16 mar 07:36 chunk1_0_aaaa
-rw-r-----. 1 jolysimo jolysimo 17246632711 16 mar 07:38 chunk1_0_aaab
-rw-r-----. 1 jolysimo jolysimo 17246563171 16 mar 07:40 chunk1_0_aaac
-rw-r-----. 1 jolysimo jolysimo 17246650716 16 mar 07:42 chunk1_0_aaad
-rw-r-----. 1 jolysimo jolysimo 17246625961 16 mar 07:43 chunk1_0_aaae
-rw-r-----. 1 jolysimo jolysimo 17246606550 16 mar 07:46 chunk1_0_aaaf
-rw-r-----. 1 jolysimo jolysimo 17246484891 16 mar 07:48 chunk1_0_aaag
-rw-r-----. 1 jolysimo jolysimo 17246806138 16 mar 07:50 chunk1_0_aaah
-rw-r-----. 1 jolysimo jolysimo 17246678293 16 mar 07:52 chunk1_0_aaai
-rw-r-----. 1 jolysimo jolysimo 17246652798 16 mar 07:56 chunk1_0_aaak
-rw-r-----. 1 jolysimo jolysimo 17246641113 16 mar 07:59 chunk1_0_aaal
-rw-r-----. 1 jolysimo jolysimo 17246533694 16 mar 08:05 chunk1_0_aaam
-rw-r-----. 1 jolysimo jolysimo 17246645078 16 mar 08:06 chunk1_0_aaan
-rw-r-----. 1 jolysimo jolysimo 17246291493 16 mar 08:08 chunk1_0_aaao
-rw-r-----. 1 jolysimo jolysimo  3498423580 16 mar 08:08 chunk1_0_aaap
-rw-r-----. 1 jolysimo jolysimo 17246774894 16 mar 08:10 chunk2_0_aaaa
-rw-r-----. 1 jolysimo jolysimo 17246563171 16 mar 08:14 chunk2_0_aaac
-rw-r-----. 1 jolysimo jolysimo 17246650716 16 mar 08:17 chunk2_0_aaad
-rw-r-----. 1 jolysimo jolysimo 17246625961 16 mar 08:19 chunk2_0_aaae
-rw-r-----. 1 jolysimo jolysimo 17246606550 16 mar 08:22 chunk2_0_aaaf
-rw-r-----. 1 jolysimo jolysimo 17246484891 16 mar 08:24 chunk2_0_aaag
-rw-r-----. 1 jolysimo jolysimo 17246678293 16 mar 08:28 chunk2_0_aaai
-rw-r-----. 1 jolysimo jolysimo 17246662304 16 mar 08:29 chunk2_0_aaaj
-rw-r-----. 1 jolysimo jolysimo 17246652798 16 mar 08:31 chunk2_0_aaak
-rw-r-----. 1 jolysimo jolysimo 17246641113 16 mar 08:33 chunk2_0_aaal
-rw-r-----. 1 jolysimo jolysimo 17246533694 16 mar 08:36 chunk2_0_aaam
-rw-r-----. 1 jolysimo jolysimo 17246645078 16 mar 08:38 chunk2_0_aaan
-rw-r-----. 1 jolysimo jolysimo 17246291493 16 mar 08:39 chunk2_0_aaao
-rw-r-----. 1 jolysimo jolysimo  3498423580 16 mar 08:40 chunk2_0_aaap
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:13 tmp_21434_14.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 08:53 tmp_21434_5.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:33 tmp_21435_10.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 08:53 tmp_21435_2.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 08:53 tmp_21440_1.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:16 tmp_21440_7.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:34 tmp_21479_11.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:13 tmp_21479_8.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 08:53 tmp_21491_0.p
-rw-r-----. 1 jolysimo jolysimo        4305 16 mar 09:00 tmp_21491_15.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:15 tmp_21491_9.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:32 tmp_21586_12.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 08:53 tmp_21586_4.p
-rw-r-----. 1 jolysimo jolysimo        4359 16 mar 09:12 tmp_21586_6.p

Thanks!

isaacovercast commented 3 years ago

Is the s1_demultiplex_stats.txt stats file created inside the fastqs directory? It could be that your cluster is having some kind of issue with unlinking files (cleaning up the tmp files after the process completes). I don't know why this would be, but HPC systems have a funny way of acting sometimes. If the s1_demultiplex_stats.txt file is there, then you can try blanking the raw_fastq_path and barcodes_path parameters and setting the sorted_fastq_path parameter to point to the fastqs directory. You'll have to re-run step 1, but in this case it will only read in the files that already exist, so it should work. I imagine that even if this works there'll be further problems with deleting files downstream, but this is at least worth trying.
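The parameter change described above can be made by hand in a text editor. As an illustration only, here is a sketch that applies it to params-file text. It assumes each line has the layout "<value>  ## [N] [name]: description" and that the path parameters are tagged raw_fastq_path, barcodes_path, and sorted_fastq_path; check the bracketed names in your own params file. The helper edit_params_text and the paths in the sample are hypothetical.

```python
import re

# Assumed tag layout of a params line: "## [N] [name]: description"
TAG = re.compile(r"## \[\d+\] \[(\w+)\]:")

def edit_params_text(text, updates):
    """Return `text` with the value column replaced for each name in `updates`."""
    out = []
    for line in text.splitlines():
        match = TAG.search(line)
        if match and match.group(1) in updates:
            comment = line[match.start():]           # keep the "## [N] [name]: ..." part
            line = f"{updates[match.group(1)]:<30} {comment}"
        out.append(line)
    return "\n".join(out)

# Hypothetical excerpt of a params file; the paths are placeholders.
sample = """\
./raw/*.fastq.gz               ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files
./barcodes.txt                 ## [3] [barcodes_path]: Location of barcodes file
                               ## [4] [sorted_fastq_path]: Location of demultiplexed/sorted fastq files
"""

print(edit_params_text(sample, {
    "raw_fastq_path": "",                              # blank the raw data path
    "barcodes_path": "",                               # blank the barcodes path
    "sorted_fastq_path": "./demo_fastqs/*.fastq.gz",   # placeholder for the existing fastqs directory
}))
```

Lines whose tag is not listed in the updates pass through unchanged, so the rest of the params file is left as it was.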

simjoly commented 3 years ago

I think this makes sense. When I tried to delete files manually (from the previous run), I got an error of 'Invalid argument' and could not delete some files. Yes, the s1_demultiplex_stats.txt file is there. I'll try what you suggest and will let you know if it works.

isaacovercast commented 3 years ago

If you can't delete files by hand then ipyrad is definitely not going to be able to delete them ;) I think you should talk to your cluster admin folks to see if you can figure out why you're getting this invalid argument thing. ipyrad creates and deletes lots of temporary files, so you are going to have this problem again, and you won't be able to work around it past step 1.
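When reporting this to the cluster admins, it may help to give them the exact error code from a failed delete. The sketch below, with the hypothetical helper try_delete, attempts a deletion and returns the symbolic errno name. The "[Errno 22] Invalid argument" message from step 1 corresponds to EINVAL on Linux.

```python
import errno
import os
import tempfile

def try_delete(path):
    """Try to delete `path`; return None on success, else (errno name, message)."""
    try:
        os.remove(path)
        return None
    except OSError as err:
        name = errno.errorcode.get(err.errno, str(err.errno))
        return name, err.strerror

# Demonstration on a throwaway file: the first call succeeds,
# the second fails because the file no longer exists.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
print(try_delete(tmp_path))
print(try_delete(tmp_path))
```

Running try_delete on one of the stuck files in the tmpdir folder should reproduce the failure outside ipyrad, which would support the idea that the problem lies with the filesystem rather than with the program.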

simjoly commented 3 years ago

Ok, I'll do this! Thanks.

simjoly commented 3 years ago

I haven't found the source of the problem, but the workaround worked fine, so I'll close this issue here. I'll let you know if it happens again.