Open gibcus opened 5 years ago
This is strange. Is there something after the header in the pairs file?
The bwa index is referring to chromosomes with names like ref|NC_001323.1|, ref|NC_006088.5|, etc.
The reduced.chromsizes file, however, refers to "human readable" chr1, chr2, chr3, etc.
Can you trace back how you created the bwa index and reduced.chromsizes? Did you use the same fasta as input?
There can be different chromosome names in the index and in reduced.chromsizes, but there must be an overlap as well!
Example: the bwa index was created using "normal" chroms plus a lot of contigs, while reduced.chromsizes refers to the "normal" chroms only.
In this example your mapping would be done using normal+contigs, thus reducing mapping ambiguity (or sometimes increasing it?), and your .pairs would contain reads mapped to the contigs; at the same time, the "heatmaps" (coolers) are going to be built WITHOUT the contigs, from the "normal" chromosomes only.
Another, unrelated problem in your distiller run is this: --resolutions 1000000,500000,250000,100000,50000,25000,10000
You're asking cooler to build 25kb heatmaps based on 10kb ones. That is probably not going to "fly", even after you fix your reference genome: resolutions in the "ladder" must be multiples of the highest one (the one with the smallest bin size), because all lower-resolution "heatmaps" are built from the highest one by consecutive coarsening.
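A quick way to sanity-check a resolution list before launching a run is sketched below. This is a stand-alone helper written for illustration (the function name `check_resolution_ladder` is made up, it is not part of cooler or distiller), and it only encodes the rule as stated above: every resolution must be an integer multiple of the smallest bin size.

```python
def check_resolution_ladder(resolutions):
    """Return (base, offenders).

    base      -- the smallest bin size, i.e. the highest resolution
    offenders -- every resolution that is NOT an integer multiple of base
    """
    base = min(resolutions)
    offenders = [r for r in sorted(resolutions) if r % base != 0]
    return base, offenders


# The list from the failing run in this thread.
requested = [1000000, 500000, 250000, 100000, 50000, 25000, 10000]
base, offenders = check_resolution_ladder(requested)
print(base, offenders)  # 10000 [25000]

# Adding a finer base bin (here 1 kb) makes every entry a multiple of the base.
print(check_resolution_ladder(requested + [1000]))  # (1000, [])
```

With a 10kb base the only offender is 25kb; starting the ladder from 5kb or 1kb instead would satisfy the rule for every entry in this list.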
From the header, it looks like your chromosomes in the pairs file use ref|NC_xxx names instead of UCSC names (chr...). That must have been how they were encoded in the FASTA file.
You can confirm by checking after the header, as Max suggested.
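For reference, that check could be scripted roughly as follows. This is only a sketch under a few assumptions: header lines in the .pairs file start with `#`, chromosome sizes are declared on `#chromsize:` lines, and the two chromosome names sit in columns 2 and 4 of each record (the same columns passed to `cooler cload pairs` via `-c1 2 -c2 4` in the failing command). The helper names and the single data record are made up for illustration.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def pairs_chrom_names(lines, max_records=10000):
    """Collect chromosome names from the header and the first records."""
    header_names, body_names = set(), set()
    records = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            if line.startswith("#chromsize:"):
                header_names.add(line.split()[1])
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        body_names.update((fields[1], fields[3]))  # chrom1, chrom2
        records += 1
        if records >= max_records:
            break
    return header_names, body_names


def chromsizes_names(lines):
    """First whitespace-separated token of every non-empty line."""
    return {line.split()[0] for line in lines if line.strip()}


def compare_names(pairs_lines, chromsizes_lines):
    header_names, body_names = pairs_chrom_names(pairs_lines)
    wanted = chromsizes_names(chromsizes_lines)
    seen = header_names | body_names
    return {
        "shared": sorted(seen & wanted),
        "only_in_pairs": sorted(seen - wanted),
        "only_in_chromsizes": sorted(wanted - seen),
    }


# Toy input: names and sizes are taken from this thread, the record is invented.
pairs_lines = [
    "## pairs format v1.0.0",
    "#chromsize: ref|NC_006088.5| 197608386",
    "#chromsize: ref|NC_006089.5| 149682049",
    "read1\tref|NC_006088.5|\t1000\tref|NC_006089.5|\t2000\t+\t-",
]
chromsizes_lines = ["chr1 197608386", "chr2 149682049"]

result = compare_names(pairs_lines, chromsizes_lines)
print(result["shared"])  # [] -> no overlap, so no pair can be binned
```

On real files the two line sources would come from `open_text(...)` on the .pairs.gz and the chrom.sizes file. An empty "shared" list means no pair can land in any bin.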
If that's the case, your options are:
Options that involve manual intervention (for a one-off case), or modifying the pipeline:
Yup, used NCBI fasta to generate the reduced.chromsizes file.
I'll generate a new one from UCSC, I guess, and try @nvictus's suggestion to re-index and re-distill.
Alternatively, I'll remap the whole d@rn thing.
"... Another unrelated problem in your distiller run is this: --resolutions 1000000,500000,250000,100000,50000,25000,10000 You're asking cooler to build 25kb heatmaps based on 10kb ones - that is probably not going to "fly" , even after you fix your reference genome: resolutions in the "ladder" must be multiples of the highest-one (smallest bin size-one) - because all lower resolution "heatmaps" are build upon the highest one by consecutive coarsening."
Another rookie mistake...
I recommend downloading the 2bit file from UCSC goldenpath. The twoBitInfo command will dump the chromosomes in a sensible order (not sorted by size), and twoBitToFa will generate the fasta.
EDIT: Just tested. Scratch the sensible order statement... maybe it was just a fluke the last couple genomes I tried it on.
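Whichever route produces the FASTA, one way to keep the index and the chromsizes file in agreement is to derive the chromsizes from the very same FASTA that goes into bwa index. Below is a minimal pure-Python sketch of that idea. The function name and the toy sequences are made up for illustration; the sequence name is taken as the header text up to the first whitespace, which is also what bwa uses as the sequence name.

```python
def fasta_chrom_sizes(lines):
    """Return [(name, length), ...] in the order sequences appear."""
    sizes = []
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sizes.append((name, length))
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        sizes.append((name, length))
    return sizes


# Toy FASTA, not real sequence.
toy_fasta = [">chr1 some description", "ACGT", "AC", ">chrM", "ACG"]
sizes = fasta_chrom_sizes(toy_fasta)
print(sizes)  # [('chr1', 6), ('chrM', 3)]

# One "name<TAB>length" line per sequence.
chromsizes_text = "".join(f"{name}\t{length}\n" for name, length in sizes)
print(chromsizes_text, end="")
```

A "reduced" chromsizes file can then be made by keeping only the wanted rows, which guarantees that every name in it also exists in the index.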
> I recommend downloading the 2bit file from UCSC goldenpath. The twoBitInfo command will dump the chromosomes in a sensible order (not sorted by size), and twoBitToFa will generate the fasta.

I considered the "soft masked" galGal6.fa.gz, but I'll check twoBitToFa.
Also, I generally start with 1kb resolution, not 10kb. It does not generate that much extra space, but may end up being useful for averages/pileups even in low-coverage datasets.
On Fri, Mar 29, 2019 at 2:24 PM Johan Gibcus notifications@github.com wrote:
> I recommend downloading the 2bit file from UCSC goldenpath. The twoBitInfo command will dump the chromosomes in a sensible order (not sorted by size), and twoBitToFa will generate the fasta.
> I considered the "soft masked" galGal6.fa.gz, but I'll check twoBitToFa.
> Also, I generally start with 1kb resolution, not 10kb. It does not generate that much extra space, but may end up being useful for averages/pileups even in low-coverage datasets.
Indeed that was a space consideration, as the libraries did not have 1kb depth. I'll take your advice!
dis.out contents:
N E X T F L O W ~ version 19.01.0
Launching `dekkerlab/distiller-nf` [mad_edison] - revision: 5f5b40f0c7 [ghpcc]
[warm up] executor > lsf
[76/2f5f3c] Submitted process > local_truncate_chunk_fastqs (library:ICRF-12min-S2-R2__galGal6 run:lane1)
[99/257402] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:03)
[f0/613fcd] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:01)
[80/0ae835] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:05)
[c1/0329d9] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:04)
[a8/268854] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:08)
[a0/efa5f1] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:02)
[11/20e124] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:09)
[c8/7c0943] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:06)
[16/c87ea8] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:07)
[1d/040136] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:11)
[e5/a8a883] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:12)
[e6/7dc0ec] Submitted process > map_parse_sort_chunks (library:ICRF-12min-S2-R2__galGal6 run:lane1 chunk:10)
[17/3f266e] Submitted process > merge_dedup_splitbam (library:ICRF-12min-S2-R2__galGal6)
[96/a57503] Submitted process > bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)
[d8/9d8b0c] Submitted process > bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:mapq_30)
[e0/acfb4b] Submitted process > merge_stats_libraries_into_groups (library_group:ICRF-12m-R2)
[0c/a48dde] Submitted process > merge_stats_libraries_into_groups (library_group:all)
[96/a57503] NOTE: Process `bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)` terminated with an error exit status (1) -- Execution is retried (1)
[42/b06c9f] Re-submitted process > bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)
[42/b06c9f] NOTE: Process `bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)` terminated with an error exit status (1) -- Execution is retried (2)
[56/efd5ad] Re-submitted process > bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)
ERROR ~ Error executing process > 'bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)'
Caused by:
  Process `bin_zoom_library_pairs (library:ICRF-12min-S2-R2__galGal6 filter:no_filter)` terminated with an error exit status (1)
Command executed:
  bgzip -cd -@ 3 ICRF-12min-S2-R2__galGal6.galGal6.nodups.pairs.gz | cooler cload pairs -c1 2 -p1 3 -c2 4 -p2 5 --assembly galGal6 galGal6.reduced.chrom.sizes:10000 - ICRF-12min-S2-R2__galGal6.galGal6.no_filter.10000.cool
  cooler zoomify --nproc 12 --out ICRF-12min-S2-R2__galGal6.galGal6.no_filter.10000.mcool --resolutions 1000000,500000,250000,100000,50000,25000,10000 --balance ICRF-12min-S2-R2__galGal6.galGal6.no_filter.10000.cool
Command exit status:
  1
Command output:
  (empty)
Command error:
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.create:Writing indexes
INFO:cooler.create:Writing info
INFO:cooler.create:Done
INFO:cooler.create:Writing chunk 8: /tmp/tmp6loc8rar.multi.cool::8
INFO:cooler.create:Creating cooler at "/tmp/tmp6loc8rar.multi.cool::/8"
INFO:cooler.create:Writing chroms
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.create:Writing indexes
INFO:cooler.create:Writing info
INFO:cooler.create:Done
INFO:cooler.create:Merging into ICRF-12min-S2-R2__galGal6.galGal6.no_filter.10000.cool
INFO:cooler.create:Creating cooler at "ICRF-12min-S2-R2__galGal6.galGal6.no_filter.10000.cool::/"
INFO:cooler.create:Writing chroms
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.reduce:nnzs: [0, 0, 0, 0, 0, 0, 0, 0, 0]
INFO:cooler.reduce:current: [0, 0, 0, 0, 0, 0, 0, 0, 0]
Traceback (most recent call last):
  File "/miniconda3/bin/cooler", line 11, in <module>
    sys.exit(cli())
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/miniconda3/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/miniconda3/lib/python3.6/site-packages/cooler/cli/cload.py", line 476, in pairs
    h5opts=h5opts,
  File "/miniconda3/lib/python3.6/site-packages/cooler/create/_create.py", line 670, in create_from_unordered
    **kwargs)
  File "/miniconda3/lib/python3.6/site-packages/cooler/create/_create.py", line 565, in create
    file_path, target, meta.columns, iterable, h5opts, lock)
  File "/miniconda3/lib/python3.6/site-packages/cooler/create/_create.py", line 204, in write_pixels
    for i, chunk in enumerate(iterable):
  File "/miniconda3/lib/python3.6/site-packages/cooler/reduce.py", line 162, in __iter__
    ignore_index=True)
  File "/miniconda3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 225, in concat
    copy=copy, sort=sort)
  File "/miniconda3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 259, in __init__
    raise ValueError('No objects to concatenate')
ValueError: No objects to concatenate
Work dir: /nl/umw_job_dekker/users/jg14w/Mapping/ICRF-12m-R2_galGal6/work/56/efd5ad506debb426aa33922a1b1abe
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)
Sender: LSF System lsfadmin@c04b04
Subject: Job 2694432: <~/SSH_plumbing/nextflow run dekkerlab/distiller-nf -r ghpcc -params-file ICRF-12m-R2_galGal6.yml -profile custom --container_cache_dir /nl/umw_job_dekker/cshare/containers --custom_config /nl/umw_job_dekker/users/jg14w/Mapping/cluster.config> in cluster Exited
Job <~/SSH_plumbing/nextflow run dekkerlab/distiller-nf -r ghpcc -params-file ICRF-12m-R2_galGal6.yml -profile custom --container_cache_dir /nl/umw_job_dekker/cshare/containers --custom_config /nl/umw_job_dekker/users/jg14w/Mapping/cluster.config> was submitted from host by user in cluster at Thu Mar 28 19:49:37 2019.
Job was executed on host(s) <2*c04b04>, in queue , as user in cluster at Thu Mar 28 19:49:37 2019.
</home/jg14w> was used as the home directory.
</nl/umw_job_dekker/users/jg14w/Mapping/ICRF-12m-R2_galGal6> was used as the working directory.
Started at Thu Mar 28 19:49:37 2019.
Terminated at Fri Mar 29 03:55:45 2019.
Results reported at Fri Mar 29 03:55:45 2019.
Your job looked like:
LSBATCH: User input ~/SSH_plumbing/nextflow run dekkerlab/distiller-nf -r ghpcc -params-file ICRF-12m-R2_galGal6.yml -profile custom --container_cache_dir /nl/umw_job_dekker/cshare/containers --custom_config /nl/umw_job_dekker/users/jg14w/Mapping/cluster.config
Exited with exit code 1.
Resource usage summary:
The output (if any) is above this job summary.
PS:
Read file for stderr output of this job.
Contents of: /nl/umw_job_dekker/cshare/reference/sorted_chromsizes/galGal6.reduced.chrom.sizes:
chr1 197608386
chr2 149682049
chr3 110838418
chr4 91315245
chr5 59809098
chr6 36374701
chr7 36742308
chr8 30219446
chr9 24153086
chr10 21119840
chr11 20200042
chr12 20387278
chr13 19166714
chr14 16219308
chr15 13062184
chr16 2844601
chr17 10762512
chr18 11373140
chr19 10323212
chr20 13897287
chr21 6844979
chr22 5459462
chr23 6149580
chr24 6491222
chr25 3980610
chr26 6055710
chr27 8080432
chr28 5116882
chr30 1818525
chr31 6153034
chr32 725831
chr33 7821666
chrM 16784
chrW 6813114
chrZ 82529921
Pairs file:
/nl/umw_job_dekker/users/jg14w/Mapping/ICRF-12m-R2_galGal6/work/17/3f266ec32d5602fe6f19069856e46b/ICRF-12min-S2-R2__galGal6.galGal6.nodups.pairs.gz
## pairs format v1.0.0
#sorted: chr1-chr2-pos1-pos2
#shape: upper triangle
#genome_assembly: unknown
#chromsize: ref|NC_001323.1| 16775
#chromsize: ref|NC_006088.5| 197608386
#chromsize: ref|NC_006089.5| 149682049
#chromsize: ref|NC_006090.5| 110838418
#chromsize: ref|NC_006091.5| 91315245
#chromsize: ref|NC_006092.5| 59809098
#chromsize: ref|NC_006093.5| 36374701
#chromsize: ref|NC_006094.5| 36742308
#chromsize: ref|NC_006095.5| 30219446
#chromsize: ref|NC_006096.5| 24153086
#chromsize: ref|NC_006097.5| 21119840
#chromsize: ref|NC_006098.5| 20200042
#chromsize: ref|NC_006099.5| 20387278
#chromsize: ref|NC_006100.5| 19166714
#chromsize: ref|NC_006101.5| 16219308
#chromsize: ref|NC_006102.5| 13062184
#chromsize: ref|NC_006103.5| 2844601
#chromsize: ref|NC_006104.5| 10762512
#chromsize: ref|NC_006105.5| 11373140
#chromsize: ref|NC_006106.5| 10323212
#chromsize: ref|NC_006107.5| 13897287
#chromsize: ref|NC_006108.5| 6844979
#chromsize: ref|NC_006109.5| 5459462
#chromsize: ref|NC_006110.5| 6149580
#chromsize: ref|NC_006111.5| 6491222
#chromsize: ref|NC_006112.4| 3980610
#chromsize: ref|NC_006113.5| 6055710
#chromsize: ref|NC_006114.5| 8080432
#chromsize: ref|NC_006115.5| 5116882
#chromsize: ref|NC_006119.4| 725831
#chromsize: ref|NC_006126.5| 6813114
#chromsize: ref|NC_006127.5| 82529921
#chromsize: ref|NC_008465.4| 7821666
#chromsize: ref|NC_028739.2| 1818525
#chromsize: ref|NC_028740.2| 6153034
#samheader: @SQ SN:ref|NC_006088.5| LN:197608386
#samheader: @SQ SN:ref|NC_006089.5| LN:149682049
#samheader: @SQ SN:ref|NC_006090.5| LN:110838418
#samheader: @SQ SN:ref|NC_006091.5| LN:91315245
#samheader: @SQ SN:ref|NC_006092.5| LN:59809098
#samheader: @SQ SN:ref|NC_006093.5| LN:36374701
#samheader: @SQ SN:ref|NC_006094.5| LN:36742308
#samheader: @SQ SN:ref|NC_006095.5| LN:30219446
#samheader: @SQ SN:ref|NC_006096.5| LN:24153086
#samheader: @SQ SN:ref|NC_006097.5| LN:21119840
#samheader: @SQ SN:ref|NC_006098.5| LN:20200042