dputhier / pygtftk

A python package and a set of shell commands to handle GTF files
GNU General Public License v3.0

Documentation failed due to ologram issue #160

Closed dputhier closed 3 years ago

dputhier commented 3 years ago
gtftk select_by_key -i mini_real.gtf.gz -k gene_biotype -v protein_coding,lincRNA,antisense,processed_transcript  |  gtftk ologram  -m gene_biotype -p ENCFF112BHN_H3K4me3_K562_sub.bed -c hg38 -D -n  -pf example_pa_02.pdf -k 8 -j summed_bp_overlaps_pvalue
 |-- 11:39-WARNING-ologram : Using only 8 threads, but 16 cores are available. Consider changing the --nb-threads parameter.
Traceback (most recent call last):
  File "/Users/puthier/anaconda3/bin/gtftk", line 4, in <module>
    __import__('pkg_resources').run_script('pygtftk==1.2.8', 'gtftk')
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 665, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1463, in run_script
    exec(code, namespace, namespace)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 110, in <module>
    args = main()
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 94, in main
    CmdManager.run(args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/cmd_manager.py", line 955, in run
    fun(**args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/plugins/ologram.py", line 1081, in ologram
    plot_results(d, data_file, pdf_file, pdf_width, pdf_height, feature_order, more_bed_multiple_overlap,
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/plugins/ologram.py", line 1474, in plot_results
    plot_process.start()
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'as_labeller.<locals>._labeller'
dputhier commented 3 years ago

@qferre A more detailed output.

gtftk select_by_key -i mini_real.gtf.gz -k gene_biotype -v protein_coding,lincRNA,antisense,processed_transcript  |  gtftk ologram  -m gene_biotype -p ENCFF112BHN_H3K4me3_K562_sub.bed -c hg38 -D -n  -pf example_pa_02.pdf -k 8 -j summed_bp_overlaps_pvalue -V 3
 |-- 11:46:37-INFO : Checking BED file format (ENCFF112BHN_H3K4me3_K562_sub.bed).
 |-- 11:46:39-WARNING-ologram : Using only 8 threads, but 16 cores are available. Consider changing the --nb-threads parameter.
 |-- 11:46:39-INFO-ologram : Checking chromosome info file.
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr1 (size: 248956422) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr2 (size: 242193529) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr3 (size: 198295559) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr4 (size: 190214555) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr5 (size: 181538259) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr6 (size: 170805979) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr7 (size: 159345973) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chrX (size: 156040895) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr8 (size: 145138636) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr9 (size: 138394717) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr11 (size: 135086622) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr10 (size: 133797422) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr12 (size: 133275309) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr13 (size: 114364328) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr14 (size: 107043718) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr15 (size: 101991189) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr16 (size: 90338345) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr17 (size: 83257441) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr18 (size: 80373285) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr20 (size: 64444167) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr19 (size: 58617616) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chrY (size: 57227415) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr22 (size: 50818468) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-DEBUG-ologram : Found chromosome chr21 (size: 46709983) in /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/chromsize_pygtftk_h1l2row0.txt)
 |-- 11:46:39-INFO-ologram : Instantiating a GTF.
 |-- 11:46:40-DEBUG-ologram : Calling get_attr_list.
 |-- 11:46:40-DEBUG_MEM-ologram : GTF created  (#l=137156, p=0x7f8788e46a00, f=-, i=140220137803104, n=1).
 |-- 11:46:40-DEBUG-ologram : Calling 'get_chroms'.
 |-- 11:46:40-DEBUG-ologram : Calling select_by_key (key=seqid, value=chr1,chr2,chr3,chr4...).
 |-- 11:46:40-INFO-ologram : Instantiating a GTF.
 |-- 11:46:40-DEBUG-ologram : Calling get_attr_list.
 |-- 11:46:40-DEBUG_MEM-ologram : GTF created  (#l=137156, p=0x7f876f94c850, f=-, i=140220137849712, n=2).
 |-- 11:46:40-DEBUG_MEM-ologram : GTF deleted  (#l=137156, p=0x7f8788e46a00, f=-, i=140220137803104, n=1).
 |-- 11:46:40-DEBUG-ologram : Calling extract_data (gene_biotype).
 |-- 11:46:41-DEBUG-ologram : Calling select_by_key (key=gene_biotype, value=antisense).
 |-- 11:46:41-INFO-ologram : Instantiating a GTF.
 |-- 11:46:41-DEBUG-ologram : Calling get_attr_list.
 |-- 11:46:41-DEBUG_MEM-ologram : GTF created  (#l=1108, p=0x7f8789e064b0, f=-, i=140220137849664, n=3).
 |-- 11:46:41-DEBUG-ologram : Calling 'to_bed' method.
 |-- 11:46:41-DEBUG-ologram : Calling extract_data (seqid,start,end,score,strand,transcript_id,gene_id,exon_id).
 |-- 11:46:46-DEBUG_MEM-ologram : GTF deleted  (#l=1108, p=0x7f8789e064b0, f=-, i=140220137849664, n=3).
 |-- 11:46:46-INFO-ologram : Beginning the computation of overlap stats for gene_biotype:antisense
 |-- 11:46:46-DEBUG-ologram : BedA: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.3_jqxbgy.tmp
 |-- 11:46:46-DEBUG-ologram : Nb. features in BedA: 2043
 |-- 11:46:46-DEBUG-ologram : BedB: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools._6l3dpsc.tmp
 |-- 11:46:46-DEBUG-ologram : Nb. features in BedB: 99
 |-- 11:46:46-DEBUG-ologram : BATCHES : 10 batches of 20 shuffles.
 |-- 11:46:46-DEBUG-ologram : Total number of shuffles : 200.
 |-- 11:46:46-DEBUG-ologram : NB_THREADS = 8.
 |-- 11:46:46-DEBUG-ologram : BED files merged and sorted via PyBedtools in 0.1259138584136963 s
 |-- 11:46:46-DEBUG-ologram : BED files read as lists of intervals in 0.08283114433288574 s
 |-- 11:46:46-INFO-ologram : We will perform a total of 10batches of shufflings.
 |-- 11:46:57-DEBUG-ologram : Total number of shuffles, reminder : 200
 |-- 11:46:57-DEBUG-ologram : Number of intersections in the first shuffle, for comparison : 3
 |-- 11:46:57-INFO-ologram : All intersections have been generated.
 |-- 11:46:57-DEBUG-ologram : Processing overlaps for gene_biotype:antisense
 |-- 11:46:57-DEBUG-ologram : gene_biotype:antisense- Statistics on overlaps computed in : 0.0012142658233642578 s.
 |-- 11:46:57-DEBUG-ologram : gene_biotype:antisense- Negative Binomial distributions fitted in: 0.24189114570617676 s.
 |-- 11:46:57-INFO-ologram : --- Total time : 11.213157892227173 s for feature(s) : gene_biotype:antisense ---
 |-- 11:46:57-DEBUG-ologram : Total time does not include BED reading, as it does not scale with batch size.
 |-- 11:46:57-INFO-ologram : Processing gene_biotype:antisense
 |-- 11:46:57-DEBUG-ologram : Calling select_by_key (key=gene_biotype, value=lincRNA).
 |-- 11:46:57-INFO-ologram : Instantiating a GTF.
 |-- 11:46:57-DEBUG-ologram : Calling get_attr_list.
 |-- 11:46:57-DEBUG_MEM-ologram : GTF created  (#l=1134, p=0x7f878bc062e0, f=-, i=140220135062784, n=4).
 |-- 11:46:57-DEBUG-ologram : Calling 'to_bed' method.
 |-- 11:46:57-DEBUG-ologram : Calling extract_data (seqid,start,end,score,strand,transcript_id,gene_id,exon_id).
 |-- 11:46:57-DEBUG_MEM-ologram : GTF deleted  (#l=1134, p=0x7f878bc062e0, f=-, i=140220135062784, n=4).
 |-- 11:46:57-INFO-ologram : Beginning the computation of overlap stats for gene_biotype:lincRNA
 |-- 11:46:57-DEBUG-ologram : BedA: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.3_jqxbgy.tmp
 |-- 11:46:57-DEBUG-ologram : Nb. features in BedA: 2043
 |-- 11:46:57-DEBUG-ologram : BedB: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.5xo0iq54.tmp
 |-- 11:46:57-DEBUG-ologram : Nb. features in BedB: 71
 |-- 11:46:57-DEBUG-ologram : BATCHES : 10 batches of 20 shuffles.
 |-- 11:46:57-DEBUG-ologram : Total number of shuffles : 200.
 |-- 11:46:57-DEBUG-ologram : NB_THREADS = 8.
 |-- 11:46:57-DEBUG-ologram : BED files merged and sorted via PyBedtools in 0.09289908409118652 s
 |-- 11:46:58-DEBUG-ologram : BED files read as lists of intervals in 0.08975815773010254 s
 |-- 11:46:58-INFO-ologram : We will perform a total of 10batches of shufflings.
 |-- 11:47:07-DEBUG-ologram : Total number of shuffles, reminder : 200
 |-- 11:47:07-DEBUG-ologram : Number of intersections in the first shuffle, for comparison : 10
 |-- 11:47:07-INFO-ologram : All intersections have been generated.
 |-- 11:47:07-DEBUG-ologram : Processing overlaps for gene_biotype:lincRNA
 |-- 11:47:07-DEBUG-ologram : gene_biotype:lincRNA- Statistics on overlaps computed in : 0.0014841556549072266 s.
 |-- 11:47:08-DEBUG-ologram : gene_biotype:lincRNA- Negative Binomial distributions fitted in: 0.23220610618591309 s.
 |-- 11:47:08-INFO-ologram : --- Total time : 10.189892292022705 s for feature(s) : gene_biotype:lincRNA ---
 |-- 11:47:08-DEBUG-ologram : Total time does not include BED reading, as it does not scale with batch size.
 |-- 11:47:08-INFO-ologram : Processing gene_biotype:lincRNA
 |-- 11:47:08-DEBUG-ologram : Calling select_by_key (key=gene_biotype, value=processed_transcript).
 |-- 11:47:08-INFO-ologram : Instantiating a GTF.
 |-- 11:47:08-DEBUG-ologram : Calling get_attr_list.
 |-- 11:47:08-DEBUG_MEM-ologram : GTF created  (#l=513, p=0x7f8788ec6a00, f=-, i=140220134679168, n=5).
 |-- 11:47:08-DEBUG-ologram : Calling 'to_bed' method.
 |-- 11:47:08-DEBUG-ologram : Calling extract_data (seqid,start,end,score,strand,transcript_id,gene_id,exon_id).
 |-- 11:47:08-DEBUG_MEM-ologram : GTF deleted  (#l=0, p=0x7f8788ec6a00, f=-, i=140220134679168, n=5).
 |-- 11:47:08-INFO-ologram : Beginning the computation of overlap stats for gene_biotype:processed_transcript
 |-- 11:47:08-DEBUG-ologram : BedA: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.3_jqxbgy.tmp
 |-- 11:47:08-DEBUG-ologram : Nb. features in BedA: 2043
 |-- 11:47:08-DEBUG-ologram : BedB: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.xkgo3zhr.tmp
 |-- 11:47:08-DEBUG-ologram : Nb. features in BedB: 16
 |-- 11:47:08-DEBUG-ologram : BATCHES : 10 batches of 20 shuffles.
 |-- 11:47:08-DEBUG-ologram : Total number of shuffles : 200.
 |-- 11:47:08-DEBUG-ologram : NB_THREADS = 8.
 |-- 11:47:08-DEBUG-ologram : BED files merged and sorted via PyBedtools in 0.09621977806091309 s
 |-- 11:47:08-DEBUG-ologram : BED files read as lists of intervals in 0.08512163162231445 s
 |-- 11:47:08-INFO-ologram : We will perform a total of 10batches of shufflings.
 |-- 11:47:16-DEBUG-ologram : Total number of shuffles, reminder : 200
 |-- 11:47:16-DEBUG-ologram : Number of intersections in the first shuffle, for comparison : 0
 |-- 11:47:16-INFO-ologram : All intersections have been generated.
 |-- 11:47:16-DEBUG-ologram : Processing overlaps for gene_biotype:processed_transcript
 |-- 11:47:16-DEBUG-ologram : gene_biotype:processed_transcript- Statistics on overlaps computed in : 0.0009810924530029297 s.
 |-- 11:47:17-DEBUG-ologram : gene_biotype:processed_transcript- Negative Binomial distributions fitted in: 0.6484692096710205 s.
 |-- 11:47:17-INFO-ologram : --- Total time : 8.919640064239502 s for feature(s) : gene_biotype:processed_transcript ---
 |-- 11:47:17-DEBUG-ologram : Total time does not include BED reading, as it does not scale with batch size.
 |-- 11:47:17-INFO-ologram : Processing gene_biotype:processed_transcript
 |-- 11:47:17-DEBUG-ologram : Calling select_by_key (key=gene_biotype, value=protein_coding).
 |-- 11:47:17-INFO-ologram : Instantiating a GTF.
 |-- 11:47:17-DEBUG-ologram : Calling get_attr_list.
 |-- 11:47:17-DEBUG_MEM-ologram : GTF created  (#l=134401, p=0x7f877eee5ca0, f=-, i=140220135062784, n=6).
 |-- 11:47:17-DEBUG-ologram : Calling 'to_bed' method.
 |-- 11:47:17-DEBUG-ologram : Calling extract_data (seqid,start,end,score,strand,transcript_id,gene_id,exon_id).
 |-- 11:47:21-DEBUG_MEM-ologram : GTF deleted  (#l=134401, p=0x7f877eee5ca0, f=-, i=140220135062784, n=6).
 |-- 11:47:21-INFO-ologram : Beginning the computation of overlap stats for gene_biotype:protein_coding
 |-- 11:47:21-DEBUG-ologram : BedA: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.3_jqxbgy.tmp
 |-- 11:47:21-DEBUG-ologram : Nb. features in BedA: 2043
 |-- 11:47:21-DEBUG-ologram : BedB: /var/folders/zy/wl3dj2_n76zfc8sdvny1q06c0000gn/T/pybedtools.8sgawsgd.tmp
 |-- 11:47:21-DEBUG-ologram : Nb. features in BedB: 749
 |-- 11:47:21-DEBUG-ologram : BATCHES : 10 batches of 20 shuffles.
 |-- 11:47:21-DEBUG-ologram : Total number of shuffles : 200.
 |-- 11:47:21-DEBUG-ologram : NB_THREADS = 8.
 |-- 11:47:21-DEBUG-ologram : BED files merged and sorted via PyBedtools in 0.10446476936340332 s
 |-- 11:47:21-DEBUG-ologram : BED files read as lists of intervals in 0.11273503303527832 s
 |-- 11:47:21-INFO-ologram : We will perform a total of 10batches of shufflings.
 |-- 11:47:32-DEBUG-ologram : Total number of shuffles, reminder : 200
 |-- 11:47:32-DEBUG-ologram : Number of intersections in the first shuffle, for comparison : 93
 |-- 11:47:32-INFO-ologram : All intersections have been generated.
 |-- 11:47:32-DEBUG-ologram : Processing overlaps for gene_biotype:protein_coding
 |-- 11:47:32-DEBUG-ologram : gene_biotype:protein_coding- Statistics on overlaps computed in : 0.018545866012573242 s.
 |-- 11:47:32-DEBUG-ologram : gene_biotype:protein_coding- Negative Binomial distributions fitted in: 0.0693519115447998 s.
 |-- 11:47:32-INFO-ologram : --- Total time : 10.88084888458252 s for feature(s) : gene_biotype:protein_coding ---
 |-- 11:47:32-DEBUG-ologram : Total time does not include BED reading, as it does not scale with batch size.
 |-- 11:47:32-INFO-ologram : Processing gene_biotype:protein_coding
 |-- 11:47:32-INFO-ologram : Adding bar plot.
 |-- 11:47:32-INFO-ologram : Page width set to 2.4
 |-- 11:47:32-INFO-ologram : Page height set to 5
 |-- 11:47:32-INFO-ologram : Saving diagram to file : example_pa_02.pdf
 |-- 11:47:32-INFO-ologram : Please be patient. Drawing may be long (minutes) for large numbers of datasets/combinations.
Traceback (most recent call last):
  File "/Users/puthier/anaconda3/bin/gtftk", line 4, in <module>
    __import__('pkg_resources').run_script('pygtftk==1.2.8', 'gtftk')
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 665, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1463, in run_script
    exec(code, namespace, namespace)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 110, in <module>
    args = main()
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 94, in main
    CmdManager.run(args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/cmd_manager.py", line 955, in run
    fun(**args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/plugins/ologram.py", line 1081, in ologram
    plot_results(d, data_file, pdf_file, pdf_width, pdf_height, feature_order, more_bed_multiple_overlap,
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/plugins/ologram.py", line 1474, in plot_results
    plot_process.start()
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/puthier/anaconda3/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'as_labeller.<locals>._labeller'
 |-- 11:47:33-DEBUG_MEM-ologram : GTF deleted  (#l=0, p=0x7f876f94c850, f=-, i=140220137849712, n=2).
dputhier commented 3 years ago

Fixed with

 import billiard as multiprocessing
qferre commented 3 years ago

I think I know what caused this. It's a simple, silly bug due to the fact that I delegate the drawing of the graph to a subprocess so we can assign a maximum time.

It's surprising that it has not happened before, though. Does your latest commit fix the issue?
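
For readers following along, here is a minimal sketch of why the traceback ends in a pickling error. The traceback goes through popen_spawn_posix.py, so the process is started with the spawn method, which pickles the Process object (including everything reachable from its target and arguments) before sending it to the child. Standard pickle cannot serialize a function defined inside another function. The names make_labeller and is_picklable are hypothetical and only illustrate the mechanism; they are not taken from pygtftk or from the plotting library.

```python
import pickle


def make_labeller(mapping):
    """Hypothetical factory returning a nested function,
    comparable in shape to 'as_labeller.<locals>._labeller'."""
    def _labeller(value):
        return mapping.get(value, value)
    return _labeller


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError, TypeError):
        return False
    return True


if __name__ == "__main__":
    # A nested (local) function cannot be pickled by reference.
    print(is_picklable(make_labeller({"a": "A"})))
    # A function reachable by its qualified name can.
    print(is_picklable(len))
```

Under the fork start method nothing is pickled at process start, which may be why the same code runs without error on some systems.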

qferre commented 3 years ago

Whoops, sorry, I posted my answer before I saw yours. Out of curiosity, what made you choose billiard specifically?

dputhier commented 3 years ago

Because I had 3 issues (2 not reported) with multiprocessing. Everything was fixed using the magic (?) of billiard. Release in progress. Best

dputhier commented 3 years ago

@qferre

Another problem is cropping up in the docs (perhaps related to import billiard as multiprocessing in ologram.py?)


dputhier commented 3 years ago

With the command:

$ gtftk ologram -z -w -q -c simple_07.chromInfo -p simple_07_peaks.bed --more-bed simple_07_peaks.1.bed simple_07_peaks.2.bed --more-bed-multiple-overlap
 |-- 12:24-WARNING : Converting to bed6 format (simple_07_peaks.bed).
 |-- 12:24-WARNING : Converting to bed6 format (simple_07_peaks.1.bed).
 |-- 12:24-WARNING : Converting to bed6 format (simple_07_peaks.2.bed).
 |-- 12:24-WARNING-ologram : Using only 1 threads, but 16 cores are available. Consider changing the --nb-threads parameter.
 |-- 12:24-WARNING-ologram : --more-bed-labels was not set, automatically defaulting to --more-bed file names.
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/puthier/anaconda3/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/stats/intersect/overlap_stats_compute.py", line 547, in index_all_these
    all_combis = np.frombuffer(shared_array_all_combis_base, dtype=ctypes.c_uint)
NameError: name 'shared_array_all_combis_base' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/puthier/anaconda3/bin/gtftk", line 4, in <module>
    __import__('pkg_resources').run_script('pygtftk==1.2.8', 'gtftk')
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 665, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1463, in run_script
    exec(code, namespace, namespace)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 110, in <module>
    args = main()
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/EGG-INFO/scripts/gtftk", line 94, in main
    CmdManager.run(args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/cmd_manager.py", line 955, in run
    fun(**args)
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/plugins/ologram.py", line 982, in ologram
    hits['multiple_beds'] = overlap_partial(bedA=peak_file, bedsB=all_more_beds,
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/stats/intersect/overlap_stats_shuffling.py", line 492, in compute_overlap_stats
    result = osc.stats_multiple_overlap(all_overlaps=all_intersections,
  File "/Users/puthier/anaconda3/lib/python3.8/site-packages/pygtftk-1.2.8-py3.8-macosx-10.9-x86_64.egg/pygtftk/stats/intersect/overlap_stats_compute.py", line 1159, in stats_multiple_overlap
    mappings.append(future.result())
  File "/Users/puthier/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/Users/puthier/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
NameError: name 'shared_array_all_combis_base' is not defined
qferre commented 3 years ago

Yes, it's likely the fault of billiard. The problem arises when trying to multiprocess a function called index_all_these, which is not a pure function (meaning it depends on variables defined elsewhere in the code). With the standard multiprocessing module, it works fine. Maybe with billiard it does not?

I'll try to look into it today.

qferre commented 3 years ago

I can confirm the error does not occur on the current develop branch (before using billiard) so it's likely the culprit. Investigating.

qferre commented 3 years ago

I cannot reproduce the error on my system, even when using the release/v.1.2.8 branch. What version of Python are you using? Mine is 3.8.8.

qferre commented 3 years ago

@dputhier Could you try to run the faulty ologram command on the version which is on the develop branch (without billiard)? If that works, I believe we can fix all this by not making "multiprocessing" an alias of billiard but calling billiard explicitly in ologram.py. This way, subsequent imports will still import "multiprocessing".

Otherwise, I don't really know what to do, since everything works fine on my machine (Xubuntu 20, Python 3.8.8) ...

qferre commented 3 years ago

If it still does not work, I may be able to jury-rig something by making the "index_all_these" function pure. But I need to know exactly what causes the problem ASAP, and we will need to do it in a videoconference call, since I cannot test this on my machine (everything works fine for me).
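
To illustrate what "making the function pure" could look like, here is a minimal sketch in which the worker receives all of its data as arguments instead of reading a module-level buffer. The real index_all_these is not shown in this thread, so index_chunk and index_in_parallel are hypothetical stand-ins with an assumed behavior (finding the position of each combination in a reference list).

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def index_chunk(all_combis, chunk):
    """Pure function: return the position in all_combis of each
    combination listed in chunk. It reads no global state."""
    positions = {combi: i for i, combi in enumerate(all_combis)}
    return [positions[combi] for combi in chunk]


def index_in_parallel(all_combis, chunks, workers=2):
    """Apply index_chunk to every chunk in a process pool."""
    # functools.partial binds the reference data explicitly, so each
    # task carries what it needs whatever the start method is.
    task = partial(index_chunk, all_combis)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, chunks))
```

The trade-off is that the bound data is pickled and copied for each task, which costs more memory and time than a shared array when the reference list is large.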

dputhier commented 3 years ago

We can check this now over a video call.

On Tue, 15 June 2021 at 13:43, Quentin Ferré @.***> wrote:

If it still does not work, I may be able to jury-rig something by making the "index_all_these" function pure. But I need to know exactly what causes the problem ASAP, and we will need to do it in a videoconference call, since I cannot test this on my machine (everything works fine for me).



qferre commented 3 years ago

@dputhier: Can you send me a video-call link? I am available for 30 minutes.