sjroth / ARTDeco

MIT License

Error running get_dogs #4

Closed by maxchang 2 years ago

maxchang commented 5 years ago

Running get_dogs mode...
GTF file exists...
Gene annotation files exist...
Inferring BAM file formats...
All BAM files are Paired-End, Strand-specific, and Reverse-strand oriented...
All tag directories exist...
Finding DoGs...
Get genes with potential DoGs with minimum length of 4000 bp...
Generate initial screening BED file for DoGs with minimum length 4000 bp and window size 500 bp...
/home/mwchang/gpfs/dev/ARTDeco/ARTDeco/get_dogs.py:97: UserWarning: DataFrame columns are not unique, some columns will be omitted.
  downstream_stop_dict = downstream_stop_df.set_index('Name').T.to_dict('list')
Initial screening coverage for DoGs with minimum length of 4000 bp...
Generate screening BED file for pre-screened DoGs...
Screening coverage for pre-screened DoGs...
Discovering DoG coordinates for pre-screened DoGs and output BED files...
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/mwchang/gpfs/miniconda3/envs/ARTDeco/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/mwchang/gpfs/miniconda3/envs/ARTDeco/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/mwchang/gpfs/dev/ARTDeco/ARTDeco/get_dogs.py", line 421, in get_all_dog_coordinates
    if overlaps[0].value in overlapping_genes:
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mwchang/gpfs/miniconda3/envs/ARTDeco/bin/ARTDeco", line 11, in <module>
    load_entry_point('ARTDeco==0.2', 'console_scripts', 'ARTDeco')()
  File "/home/mwchang/gpfs/dev/ARTDeco/ARTDeco/main.py", line 508, in main
    os.path.join(args.home_dir,'dogs'))
  File "/home/mwchang/gpfs/dev/ARTDeco/ARTDeco/get_dogs.py", line 463, in get_multi_dog_beds
    dog_dfs = pool.map(get_all_dog_coordinates,cmds)
  File "/home/mwchang/gpfs/miniconda3/envs/ARTDeco/lib/python3.6/multiprocessing/pool.py", line 288, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/mwchang/gpfs/miniconda3/envs/ARTDeco/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
IndexError: list index out of range

sjroth commented 5 years ago

Can you provide the command you used and the location of the input files?

maxchang commented 5 years ago

Files are in /home/mwchang/gpfs/projects/rna-seq/fluomics/160406-cell-lines/artdeco_GRCh38

The command was ARTDeco -mode get_dogs -gtf-file modified_genes.gtf -chrom-sizes-file chrom.sizes

Running diff_exp_read_in with the same files seemed to work.

sjroth commented 5 years ago

Hmmm... I will investigate. Thanks!

sjroth commented 5 years ago

I do not have access to this directory. Can you copy this directory to /gpfs/data01/bennerlab/home/sjroth?

maxchang commented 5 years ago

Sorry, that was the non-canonical directory. Try /gpfs/data01/bennerlab/home/mchang/projects/rna-seq/fluomics/160406-cell-lines/artdeco_GRCh38

sjroth commented 5 years ago

Got it! Will run debugging now! Thanks!

sjroth commented 5 years ago

I was unable to reproduce the error. It seems like it may be an issue with the GTF file (given the placement of the warnings). What was the original GTF that you were using?
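
For anyone hitting the same UserWarning, here is a minimal sketch of how to check a table for repeated gene names before it goes through the `set_index('Name').T.to_dict('list')` call shown in the log. pandas raises that warning when the values used as the index are not unique, and the resulting dictionary then has fewer entries than the table has rows. The toy DataFrame and every column other than `Name` are assumptions for illustration, not ARTDeco's actual intermediate data. Whether this is what leads to the later IndexError is not confirmed in this thread.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the table derived from the GTF.
# Only the 'Name' column is taken from the log; the rest is illustrative.
downstream_stop_df = pd.DataFrame(
    {
        "Name": ["GENE_A", "GENE_B", "GENE_A"],
        "Chrom": ["chr1", "chr1", "chr2"],
        "Stop": [1000, 5000, 9000],
    }
)

# Rows whose 'Name' occurs more than once (keep=False marks every copy).
duplicated_rows = downstream_stop_df[downstream_stop_df["Name"].duplicated(keep=False)]
duplicate_names = sorted(duplicated_rows["Name"].unique())

# Repeat the conversion from the log while recording any warnings it emits.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    downstream_stop_dict = downstream_stop_df.set_index("Name").T.to_dict("list")

# Each duplicated name collapses to a single dictionary key.
dropped = len(downstream_stop_df) - len(downstream_stop_dict)

print("Duplicated names:", duplicate_names)
print("Rows lost in the dictionary:", dropped)
print("Warnings raised:", [str(w.message) for w in caught])
```

If the list of duplicated names is non-empty for a real annotation, comparing the modified GTF against the original one for repeated gene names or IDs would be a reasonable next step.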