dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Engine error #533

Closed nguzman14 closed 11 months ago

nguzman14 commented 11 months ago

I was trying to run step 7, but I ran into a problem:

Step 7: Filtering and formatting output files
[####################] 100% 0:00:53 | applying filters
[####################] 100% 0:08:40 | building arrays

Encountered an Error. Message: EngineError: Engine b'1bbf1484-3142a42ff02804fdb15f35f0' died while running task 'e99ee339-2c9902b3ad0c7e8f544ef409_29185_98' warning: error during shutdown: [Errno 3] No such process

Can you help me?

isaacovercast commented 11 months ago

Hello, 'EngineError' is almost always a RAM issue. Please try allocating more RAM, or reducing the number of cores so the available RAM is spread over fewer engines (or both), and let me know how it goes.
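
For reference, a minimal command-line sketch of that advice (the params filename params-trimero.txt is inferred from the assembly name in the params file below, and 4 cores is only an example value):

```bash
# re-run only step 7 with fewer cores so each engine gets a larger share of RAM;
# -f forces the step to run again after the earlier failed attempt
ipyrad -p params-trimero.txt -s 7 -c 4 -f
```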

nguzman14 commented 11 months ago

Thanks. I can't allocate more RAM, and when I reduce cores it shows the same error. It only works when I set some more restrictive parameters, for example a mincov (min_samples_locus) of 30 and a max_SNPs_locus of 0.05. I have 185 paired-end samples. I'm copying my params file below in case it helps to find a solution. Thanks in advance.

------- ipyrad params file (v.0.9.93)-------------------------------------------
trimero                               ## [0] [assembly_name]: Assembly name. Used to name output directories for assembly steps
/home/noelia/ipyrad                   ## [1] [project_dir]: Project dir (made in curdir if not present)
                                      ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files
                                      ## [3] [barcodes_path]: Location of barcodes file
/home/noelia/ipyrad/fastq_tri/*.fq.gz ## [4] [sorted_fastq_path]: Location of demultiplexed/sorted fastq files
denovo                                ## [5] [assembly_method]: Assembly method (denovo, reference)
                                      ## [6] [reference_sequence]: Location of reference sequence file
pairddrad                             ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.
AGCTT,CATGC                           ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)
5                                     ## [9] [max_low_qual_bases]: Max low quality base calls (Q<20) in a read
33                                    ## [10] [phred_Qscore_offset]: phred Q score offset (33 is default and very standard)
6                                     ## [11] [mindepth_statistical]: Min depth for statistical base calling
6                                     ## [12] [mindepth_majrule]: Min depth for majority-rule base calling
10000                                 ## [13] [maxdepth]: Max cluster depth within samples
0.85                                  ## [14] [clust_threshold]: Clustering threshold for de novo assembly
2                                     ## [15] [max_barcode_mismatch]: Max number of allowable mismatches in barcodes
2                                     ## [16] [filter_adapters]: Filter for adapters/primers (1 or 2=stricter)
35                                    ## [17] [filter_min_trim_len]: Min length of reads after adapter trim
2                                     ## [18] [max_alleles_consens]: Max alleles per site in consensus sequences
0.05                                  ## [19] [max_Ns_consens]: Max N's (uncalled bases) in consensus
0.05                                  ## [20] [max_Hs_consens]: Max Hs (heterozygotes) in consensus
30                                    ## [21] [min_samples_locus]: Min # samples per locus for output
0.05                                  ## [22] [max_SNPs_locus]: Max # SNPs per locus
8                                     ## [23] [max_Indels_locus]: Max # of indels per locus
0.5                                   ## [24] [max_shared_Hs_locus]: Max # heterozygous sites per locus
5, 75, 5, 75                          ## [25] [trim_reads]: Trim raw read edges (R1>, <R1, R2>, <R2) (see docs)
0, 0, 0, 0                            ## [26] [trim_loci]: Trim locus edges (see docs) (R1>, <R1, R2>, <R2)
p, s, u, n, k, g, v, G, a             ## [27] [output_formats]: Output formats (see docs)
                                      ## [28] [pop_assign_file]: Path to population assignment file
                                      ## [29] [reference_as_filter]: Reads mapped to this reference are removed in step 3
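
For context, a minimal sketch of how more restrictive filter settings can be tested on a branch without re-running the earlier steps (the params filename is inferred from the assembly name above, and the branch name trimero_strict is only an example):

```bash
# create a branch of the assembly; this writes a new params-trimero_strict.txt
ipyrad -p params-trimero.txt -b trimero_strict

# edit params-trimero_strict.txt (e.g. [21] min_samples_locus and [22] max_SNPs_locus),
# then re-run only the filtering/output step on the branch
ipyrad -p params-trimero_strict.txt -s 7
```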
isaacovercast commented 11 months ago

Ok, glad it works. It makes sense that increasing min_samples_locus fixes the problem: it effectively reduces the amount of data retained, which reduces the pressure on RAM.
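
As a closing sketch (the output path is assumed from the project_dir and assembly name in the params file above), the step 7 stats file is where one can check how many loci each filter removed and how many were retained under the stricter settings:

```bash
# the per-filter locus counts sit near the top of the step 7 stats file
head -n 40 /home/noelia/ipyrad/trimero_outfiles/trimero_stats.txt
```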