nloyfer / wgbs_tools

tools for working with Bisulfite Sequencing data while preserving reads intrinsic dependencies

Question regarding generating bigger blocks in brain .beta files wgbstools segment #61

Open DieStok opened 10 months ago

DieStok commented 10 months ago

Hi there. Thanks for this great suite of tools. I wanted to create larger, more contiguous blocks focused solely on the patterns in brain cells. For this, I subset your .beta files to those describing neurons and oligodendrocytes, and tried to run wgbstools segment with much higher max_bp and max_cpg, with different chunk sizes. Unfortunately, I keep running into an error. This is the call I use:

max_bp = 50_000
max_cpg = 50_000
output_path = f'/home/cog/dstoker/sharedprojects/sturgeon/sturgeoff/analysis/dstoker/data/brain_specific_blocks/brain_specific_blocks_maxbp_{max_bp}_maxcpg_{max_cpg}.bed'
CHUNK_SIZE = 30_000
# threads argument left at its default, which uses all available CPUs
subprocess.run(f'wgbstools segment --max_bp {max_bp} --max_cpg {max_cpg} --beta_file {file_path_out_beta_selection_brain_blocks} -o {output_path} -c {CHUNK_SIZE} --genome hg38', shell = True)
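The same call can also be built as an argument list, which avoids shell quoting problems and lets check=True raise as soon as wgbstools exits with a non-zero status. This is only a sketch: the helper names build_segment_cmd and run_segment are hypothetical, and the flags are exactly the ones used in the call above.

```python
import subprocess


def build_segment_cmd(beta_list_file, output_path, max_bp, max_cpg, chunk_size, genome):
    """Return the `wgbstools segment` call as a list of strings.

    Hypothetical helper: it only repeats the flags from the call above
    and does not add or validate any options.
    """
    return [
        "wgbstools", "segment",
        "--max_bp", str(max_bp),
        "--max_cpg", str(max_cpg),
        "--beta_file", str(beta_list_file),
        "-o", str(output_path),
        "-c", str(chunk_size),
        "--genome", str(genome),
    ]


def run_segment(cmd):
    """Run the command without a shell; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True)
```

Calling run_segment(build_segment_cmd(file_path_out_beta_selection_brain_blocks, output_path, max_bp, max_cpg, CHUNK_SIZE, 'hg38')) would then be equivalent to the shell string above, except that a failed run raises an exception instead of passing silently.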

The file passed to --beta_file (file_path_out_beta_selection_brain_blocks) contains just:

/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta

These are indeed the paths of .beta files of brain-derived cell types.

Regardless of the genome I use, and after some fiddling with chunk size, max_bp, etc., I keep getting errors of the form "nr_sites != number of loci: 137 != 0". I see that this error is thrown in the C++ implementation of chunking (the segmentor binary), but I don't quite know what is going wrong, and different chunk sizes don't seem to solve the issue.

Full error log:

/bin/sh: line 1: 3419005 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr1:1-133
     3419007                       | cut -f2
     3419008 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 0 -n 133 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (1, 134)
Error: nr_sites != number of loci: 6 != 0. Try different chunck size!
terminate called after throwing an instance of 'int'
/bin/sh: line 1: 3419004 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr3:312-523
     3419006                       | cut -f2
     3419009 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 311 -n 212 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (312, 524)
Error: nr_sites != number of loci: 108 != 0. Try different chunck size!
terminate called after throwing an instance of 'int'
/bin/sh: line 1: 3419018 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr5:792-797
     3419019                       | cut -f2
     3419020 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 791 -n 6 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (792, 798)
Error: nr_sites != number of loci: 137 != 0. Try different chunck size!
terminate called after throwing an instance of 'int'
/bin/sh: line 1: 3419029 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr9:833-940
     3419030                       | cut -f2
     3419031 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 832 -n 108 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (833, 941)
Error: nr_sites != number of loci: 122 != 0. Try different chunck size!
terminate called after throwing an instance of 'int'
/bin/sh: line 1: 3419058 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr18:1301-1422
     3419059                       | cut -f2
     3419060 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 1300 -n 122 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (1301, 1423)
/bin/sh: line 1: 3419041 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr12:1062-1198
     3419042                       | cut -f2
     3419043 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 1061 -n 137 -max_cpg 25000 -ps 15 -max_bp 50000
Error: nr_sites != number of loci: 35 != 0. Try different chunck size!
terminate called after throwing an instance of 'int'
Failed in sites (1062, 1199)
/bin/sh: line 1: 3419136 Done                    tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chrY:1458-1492
     3419137                       | cut -f2
     3419138 Aborted                 (core dumped) | /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 1457 -n 35 -max_cpg 25000 -ps 15 -max_bp 50000
Failed in sites (1458, 1493)
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/python/segment.py", line 58, in segment_process
    raise e
  File "/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/python/segment.py", line 53, in segment_process
    brd_str = subprocess.check_output(cmd, shell=True).decode().split()
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr1:1-133 | cut -f2 |/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 0 -n 133 -max_cpg 25000  -ps 15 -max_bp 50000 ' returned non-zero exit status 134.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/cog/dstoker/sharedprojects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/wgbstools", line 97, in <module>
    main()
  File "/home/cog/dstoker/sharedprojects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/wgbstools", line 64, in main
    importlib.import_module(args.command).main()
  File "/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/python/segment.py", line 310, in main
    SegmentByChunks(args, betas).run()
  File "/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/python/segment.py", line 136, in run
    arr = p.starmap(segment_process, params)
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/hpc/compgen/users/dstoker/Software/anaconda3/envs/sturgeon_env/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
subprocess.CalledProcessError: Command 'tabix /hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/references/hg38/rev.CpG.bed.gz chr1:1-133 | cut -f2 |/hpc/compgen/projects/generative_methylation_project/analysis/dstoker/software/wgbs_tools/src/segment_betas/segmentor /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652219_Oligodendrocytes-Z000000TK.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652220_Oligodendrocytes-Z0000042E.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652221_Oligodendrocytes-Z0000042L.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652222_Oligodendrocytes-Z0000042N.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652223_Cortex-Neuron-Z000000TF.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652224_Neuron-Z000000TH.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652225_Cortex-Neuron-Z0000042F.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652226_Cortex-Neuron-Z0000042H.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652227_Cortex-Neuron-Z0000042J.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652228_Cortex-Neuron-Z0000042M.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652229_Cortex-Neuron-Z0000042P.beta 
/home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652230_Cortex-Neuron-Z0000042K.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652231_Cerebellum-Neuron-Z000000TB.beta /home/cog/dstoker/sharedprojects/generative_methylation_project/raw/all_supplementary_files/unpacked/GSM5652232_Cortex-Neuron-Z000000TD.beta -s 0 -n 133 -max_cpg 25000  -ps 15 -max_bp 50000 ' returned non-zero exit status 134.

I would be grateful for any pointers. In the meantime I will keep experimenting with the chunk size, as the error message recommends, although that has not fixed the issue so far. Thanks again!

yonniejon commented 1 month ago

Are you using the files downloaded from the paper? I believe these files use the hg19 reference genome, while you passed --genome hg38 in your command. Let me know if this helps.
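One way to test for such a genome mismatch before running segment is to compare the number of CpG sites implied by each .beta file with the number of sites in the reference. The sketch below rests on two assumptions that should be checked against the wgbstools documentation: that a .beta file is a flat binary table with two uint8 values per CpG site (so its size in bytes is twice the number of sites), and that the reference directory holds a CpG.bed.gz file with one line per CpG site next to the rev.CpG.bed.gz seen in the log. All function names are hypothetical.

```python
import gzip
import os


def beta_site_count(beta_path):
    """Number of CpG sites implied by the file size.

    Assumes 2 bytes per site (two uint8 columns), which is an
    assumption about the .beta layout, not something verified here.
    """
    size = os.path.getsize(beta_path)
    if size % 2:
        raise ValueError(f"{beta_path}: odd size {size}, not a 2-column uint8 table")
    return size // 2


def count_reference_sites(cpg_bed_gz):
    """Count lines in a (b)gzipped CpG bed file, assumed one line per site."""
    with gzip.open(cpg_bed_gz, "rt") as handle:
        return sum(1 for _ in handle)


def find_mismatches(beta_paths, cpg_bed_gz):
    """Return {beta_path: site_count} for files that disagree with the reference."""
    expected = count_reference_sites(cpg_bed_gz)
    mismatches = {}
    for path in beta_paths:
        n_sites = beta_site_count(path)
        if n_sites != expected:
            mismatches[path] = n_sites
    return mismatches
```

If find_mismatches returns a non-empty dictionary for the hg38 reference but an empty one for hg19, the files were most likely built against hg19 and the --genome argument should be changed accordingly.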