aertslab / scenicplus

SCENIC+ is a python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

Assertion Error running wrapper: run_pycistarget, with custom annotation #60

Closed SidG13 closed 1 year ago

SidG13 commented 1 year ago

Describe the bug

I'm getting a KeyError when running the wrapper function. The function works well up until the 'Creating contrast groups' step in the motif_enrichment_dem.py script. I ran into this issue when using @Goultard59's original PR. When I stepped through the code line by line, the fg_pr object at the point where the script stops working was created properly, so I'm not sure what exactly is causing this issue.

To Reproduce

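import os
from scenicplus.wrappers.run_pycistarget import run_pycistarget
import pandas as pd

# region_sets, work_dir, and tmp_dir are assumed to be defined in earlier notebook cells.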

custom_annot = pd.read_csv('/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/TSS_annot.txt', header=0, sep='\t')

rankings_db = '/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/for_cistarget/Crotalus_viridis_v1_consensuspeaks.regions_vs_motifs.rankings.feather'
scores_db =  '/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/for_cistarget/Crotalus_viridis_v1_consensuspeaks.regions_vs_motifs.scores.feather'
motif_annotation = '/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/for_cistarget/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0_just_jaspar2.tbl'

run_pycistarget(
    region_sets = region_sets,
    species = 'custom',
    save_path = os.path.join(work_dir, 'motifs'),
    custom_annot = custom_annot,
    ctx_db_path = rankings_db,
    dem_db_path = scores_db,
    path_to_motif_annotations = motif_annotation,
    run_without_promoters = False,
    n_cpu = 4,
    _temp_dir = os.path.join(tmp_dir, 'ray_spill'),
    annotation_version = 'v1',
    )

Error output

2022-11-02 13:34:47,544 pycisTarget_wrapper INFO     /home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data/motifs folder already exists.
2022-11-02 13:34:47,552 pycisTarget_wrapper INFO     Loading cisTarget database for topics_otsu
2022-11-02 13:34:47,552 cisTarget    INFO     Reading cisTarget database
2022-11-02 13:35:01,673 pycisTarget_wrapper INFO     Running cisTarget for topics_otsu
2022-11-02 13:35:05,448 INFO worker.py:1509 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265 
(ctx_internal_ray pid=25007) 2022-11-02 13:35:06,881 cisTarget    INFO     Running cisTarget for Topic1 which has 2868 regions
(ctx_internal_ray pid=25006) 2022-11-02 13:35:06,953 cisTarget    INFO     Running cisTarget for Topic2 which has 23 regions
(ctx_internal_ray pid=25005) 2022-11-02 13:35:07,051 cisTarget    INFO     Running cisTarget for Topic3 which has 3338 regions
(ctx_internal_ray pid=25008) 2022-11-02 13:35:07,091 cisTarget    INFO     Running cisTarget for Topic4 which has 2092 regions
(ctx_internal_ray pid=25007) 2022-11-02 13:35:10,787 cisTarget    INFO     Annotating motifs for Topic1
(ctx_internal_ray pid=25006) 2022-11-02 13:35:10,848 cisTarget    INFO     Annotating motifs for Topic2
(ctx_internal_ray pid=25005) 2022-11-02 13:35:10,865 cisTarget    INFO     Annotating motifs for Topic3
(ctx_internal_ray pid=25007) 2022-11-02 13:35:11,005 cisTarget    INFO     Getting cistromes for Topic1
(ctx_internal_ray pid=25006) 2022-11-02 13:35:11,001 cisTarget    INFO     Getting cistromes for Topic2
(ctx_internal_ray pid=25008) 2022-11-02 13:35:10,945 cisTarget    INFO     Annotating motifs for Topic4
(ctx_internal_ray pid=25007) 2022-11-02 13:35:11,052 cisTarget    INFO     Running cisTarget for Topic5 which has 1916 regions
(ctx_internal_ray pid=25006) 2022-11-02 13:35:11,077 cisTarget    INFO     Running cisTarget for Topic6 which has 1932 regions
(ctx_internal_ray pid=25008) 2022-11-02 13:35:11,071 cisTarget    INFO     Getting cistromes for Topic4
(ctx_internal_ray pid=25008) 2022-11-02 13:35:11,112 cisTarget    INFO     Running cisTarget for Topic7 which has 3149 regions
(ctx_internal_ray pid=25005) 2022-11-02 13:35:11,083 cisTarget    INFO     Getting cistromes for Topic3
(ctx_internal_ray pid=25005) 2022-11-02 13:35:11,201 cisTarget    INFO     Running cisTarget for Topic8 which has 1857 regions
(ctx_internal_ray pid=25006) 2022-11-02 13:35:11,575 cisTarget    INFO     Annotating motifs for Topic6
(ctx_internal_ray pid=25007) 2022-11-02 13:35:11,661 cisTarget    INFO     Annotating motifs for Topic5
(ctx_internal_ray pid=25006) 2022-11-02 13:35:11,718 cisTarget    INFO     Getting cistromes for Topic6
(ctx_internal_ray pid=25006) 2022-11-02 13:35:11,760 cisTarget    INFO     Running cisTarget for Topic9 which has 1328 regions
(ctx_internal_ray pid=25008) 2022-11-02 13:35:11,784 cisTarget    INFO     Annotating motifs for Topic7
(ctx_internal_ray pid=25005) 2022-11-02 13:35:11,789 cisTarget    INFO     Annotating motifs for Topic8
(ctx_internal_ray pid=25007) 2022-11-02 13:35:11,851 cisTarget    INFO     Getting cistromes for Topic5
(ctx_internal_ray pid=25005) 2022-11-02 13:35:11,924 cisTarget    INFO     Getting cistromes for Topic8
(ctx_internal_ray pid=25007) 2022-11-02 13:35:11,933 cisTarget    INFO     Running cisTarget for Topic10 which has 14 regions
(ctx_internal_ray pid=25008) 2022-11-02 13:35:11,937 cisTarget    INFO     Getting cistromes for Topic7
(ctx_internal_ray pid=25008) 2022-11-02 13:35:11,985 cisTarget    INFO     Running cisTarget for Topic11 which has 1581 regions
(ctx_internal_ray pid=25005) 2022-11-02 13:35:11,990 cisTarget    INFO     Running cisTarget for Topic12 which has 872 regions
(ctx_internal_ray pid=25007) 2022-11-02 13:35:12,227 cisTarget    INFO     Annotating motifs for Topic10
(ctx_internal_ray pid=25006) 2022-11-02 13:35:12,168 cisTarget    INFO     Annotating motifs for Topic9
(ctx_internal_ray pid=25007) 2022-11-02 13:35:12,350 cisTarget    INFO     Getting cistromes for Topic10
(ctx_internal_ray pid=25007) 2022-11-02 13:35:12,373 cisTarget    INFO     Running cisTarget for Topic13 which has 22 regions
(ctx_internal_ray pid=25006) 2022-11-02 13:35:12,419 cisTarget    INFO     Getting cistromes for Topic9
(ctx_internal_ray pid=25005) 2022-11-02 13:35:12,390 cisTarget    INFO     Annotating motifs for Topic12
(ctx_internal_ray pid=25008) 2022-11-02 13:35:12,456 cisTarget    INFO     Annotating motifs for Topic11
(ctx_internal_ray pid=25006) 2022-11-02 13:35:12,560 cisTarget    INFO     Running cisTarget for Topic14 which has 945 regions
(ctx_internal_ray pid=25005) 2022-11-02 13:35:12,616 cisTarget    INFO     Getting cistromes for Topic12
(ctx_internal_ray pid=25007) 2022-11-02 13:35:12,707 cisTarget    INFO     Annotating motifs for Topic13
(ctx_internal_ray pid=25008) 2022-11-02 13:35:12,698 cisTarget    INFO     Getting cistromes for Topic11
(ctx_internal_ray pid=25007) 2022-11-02 13:35:12,850 cisTarget    INFO     Getting cistromes for Topic13
(ctx_internal_ray pid=25005) 2022-11-02 13:35:12,762 cisTarget    INFO     Running cisTarget for Topic15 which has 1362 regions
(ctx_internal_ray pid=25008) 2022-11-02 13:35:12,860 cisTarget    INFO     Running cisTarget for Topic16 which has 1492 regions
(ctx_internal_ray pid=25006) 2022-11-02 13:35:13,013 cisTarget    INFO     Annotating motifs for Topic14
(ctx_internal_ray pid=25006) 2022-11-02 13:35:13,199 cisTarget    INFO     Getting cistromes for Topic14
(ctx_internal_ray pid=25008) 2022-11-02 13:35:13,220 cisTarget    INFO     Annotating motifs for Topic16
(ctx_internal_ray pid=25005) 2022-11-02 13:35:13,211 cisTarget    INFO     Annotating motifs for Topic15
(ctx_internal_ray pid=25008) 2022-11-02 13:35:13,355 cisTarget    INFO     Getting cistromes for Topic16
(ctx_internal_ray pid=25005) 2022-11-02 13:35:13,415 cisTarget    INFO     Getting cistromes for Topic15
2022-11-02 13:35:16,123 cisTarget    INFO     Done!
2022-11-02 13:35:16,124 pycisTarget_wrapper INFO     /home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data/motifs/CTX_topics_otsu_All folder already exists.
2022-11-02 13:35:16,319 pycisTarget_wrapper INFO     Running DEM for topics_otsu
2022-11-02 13:35:16,320 DEM          INFO     Reading DEM database
2022-11-02 13:35:32,955 DEM          INFO     Creating contrast groups
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/base.py:3800, in Index.get_loc(self, key, method, tolerance)
   3799 try:
-> 3800     return self._engine.get_loc(casted_key)
   3801 except KeyError as err:

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Chromosome'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In [16], line 6
      2 import pandas as pd 
      4 custom_annot = pd.read_csv('/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/TSS_annot.txt', header=0, sep='\t')
----> 6 run_pycistarget(
      7     region_sets = region_sets,
      8     species = 'custom',
      9     save_path = os.path.join(work_dir, 'motifs'),
     10     custom_annot = custom_annot,
     11     ctx_db_path = rankings_db,
     12     dem_db_path = scores_db,
     13     path_to_motif_annotations = motif_annotation,
     14     run_without_promoters = False,
     15     n_cpu = 4,
     16     _temp_dir = os.path.join(tmp_dir, 'ray_spill'),
     17     annotation_version = 'v1',
     18     )

File ~/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/scenicplus/src/scenicplus/wrappers/run_pycistarget.py:263, in run_pycistarget(region_sets, species, save_path, custom_annot, save_partial, ctx_db_path, dem_db_path, run_without_promoters, biomart_host, promoter_space, ctx_auc_threshold, ctx_nes_threshold, ctx_rank_threshold, dem_log2fc_thr, dem_motif_hit_thr, dem_max_bg_regions, annotation, motif_similarity_fdr, path_to_motif_annotations, annotation_version, n_cpu, _temp_dir, exclude_motifs, exclude_collection, **kwargs)
    261     for col in exclude_collection:
    262         dem_db.db_scores = dem_db.db_scores[~dem_db.db_scores.index.str.contains(col)]
--> 263 menr['DEM_'+key+'_All'] = DEM(dem_db = dem_db,
    264                    region_sets = regions,
    265                    log2fc_thr = dem_log2fc_thr,
    266                    motif_hit_thr = dem_motif_hit_thr,
    267                    max_bg_regions = dem_max_bg_regions,
    268                    specie = species,
    269                    genome_annotation = annot_dem,
    270                    promoter_space = promoter_space,
    271                    motif_annotation =   annotation,
    272                    motif_similarity_fdr = motif_similarity_fdr, 
    273                    path_to_motif_annotations = path_to_motif_annotations,
    274                    n_cpu = n_cpu,
    275                    annotation_version = annotation_version,
    276                    tmp_dir = save_path,
    277                    _temp_dir= _temp_dir,
    278                    **kwargs)
    279 out_folder = os.path.join(save_path,'DEM_'+key+'_All')
    280 check_folder = os.path.isdir(out_folder)

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pycistarget/motif_enrichment_dem.py:319, in DEM.__init__(self, dem_db, region_sets, specie, subset_motifs, contrasts, name, max_bg_regions, adjpval_thr, log2fc_thr, mean_fg_thr, motif_hit_thr, n_cpu, fraction_overlap, cluster_buster_path, path_to_genome_fasta, path_to_motifs, genome_annotation, promoter_space, path_to_motif_annotations, annotation_version, motif_annotation, motif_similarity_fdr, orthologous_identity_threshold, tmp_dir, **kwargs)
    317 self.cistromes = None
    318 if dem_db is not None:
--> 319     self.run(dem_db.db_scores, **kwargs)

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pycistarget/motif_enrichment_dem.py:365, in DEM.run(self, dem_db_scores, **kwargs)
    363 # Get region groups
    364 log.info('Creating contrast groups')
--> 365 region_groups = [create_groups(contrast = contrasts[x],
    366                            region_sets_names = region_sets_names,
    367                            max_bg_regions = self.max_bg_regions,
    368                            path_to_genome_fasta = self.path_to_genome_fasta,
    369                            path_to_regions_fasta = os.path.join(self.tmp_dir, contrasts_names[x] +'.fa'),  
    370                            cbust_path = self.cluster_buster_path,
    371                            path_to_motifs = self.path_to_motifs,
    372                            annotation = self.genome_annotation,
    373                            promoter_space = self.promoter_space,
    374                            motifs = dem_db_scores.index.tolist(),
    375                            n_cpu = self.n_cpu,
    376                            **kwargs) for x in range(len(contrasts))]
    378 # Compute p-val and log2FC
    379 if self.n_cpu > len(region_groups):

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pycistarget/motif_enrichment_dem.py:365, in <listcomp>(.0)
    363 # Get region groups
    364 log.info('Creating contrast groups')
--> 365 region_groups = [create_groups(contrast = contrasts[x],
    366                            region_sets_names = region_sets_names,
    367                            max_bg_regions = self.max_bg_regions,
    368                            path_to_genome_fasta = self.path_to_genome_fasta,
    369                            path_to_regions_fasta = os.path.join(self.tmp_dir, contrasts_names[x] +'.fa'),  
    370                            cbust_path = self.cluster_buster_path,
    371                            path_to_motifs = self.path_to_motifs,
    372                            annotation = self.genome_annotation,
    373                            promoter_space = self.promoter_space,
    374                            motifs = dem_db_scores.index.tolist(),
    375                            n_cpu = self.n_cpu,
    376                            **kwargs) for x in range(len(contrasts))]
    378 # Compute p-val and log2FC
    379 if self.n_cpu > len(region_groups):

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pycistarget/motif_enrichment_dem.py:522, in create_groups(contrast, region_sets_names, max_bg_regions, path_to_genome_fasta, path_to_regions_fasta, cbust_path, path_to_motifs, annotation, promoter_space, motifs, n_cpu, **kwargs)
    520 # Nr of promoters in the foreground
    521 fg_pr_overlap = pr.PyRanges(region_names_to_coordinates(foreground)).count_overlaps(annotation)
--> 522 fg_pr = coord_to_region_names(fg_pr_overlap[fg_pr_overlap.NumberOverlaps != 0])
    523 if len(fg_pr) == len(foreground):
    524     nr_pr = max_bg_regions

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pycistarget/utils.py:18, in coord_to_region_names(coord)
     16 if isinstance(coord, pr.PyRanges):
     17     coord = coord.as_df()
---> 18     return list(coord['Chromosome'].astype(str) + ':' + coord['Start'].astype(str) + '-' + coord['End'].astype(str))

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pandas/core/frame.py:3805, in DataFrame.__getitem__(self, key)
   3803 if self.columns.nlevels > 1:
   3804     return self._getitem_multilevel(key)
-> 3805 indexer = self.columns.get_loc(key)
   3806 if is_integer(indexer):
   3807     indexer = [indexer]

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/base.py:3802, in Index.get_loc(self, key, method, tolerance)
   3800     return self._engine.get_loc(casted_key)
   3801 except KeyError as err:
-> 3802     raise KeyError(key) from err
   3803 except TypeError:
   3804     # If we have a listlike key, _check_indexing_error will raise
   3805     #  InvalidIndexError. Otherwise we fall through and re-raise
   3806     #  the TypeError.
   3807     self._check_indexing_error(key)

KeyError: 'Chromosome'

Screenshots

Here's what my custom_annot pandas df looks like:

image

Here's the structure of my region_sets:

image

Version (please complete the following information):

Additional context

Thank you for your help!

Goultard59 commented 1 year ago

Have you checked that the chromosome names are consistent between your custom annotation dataframe, your rankings database, and your AnnData matrix?

From what I remember of building my own SCENIC database, you need to check that Cell Ranger does not modify your gene names when there are duplicates (for example, U6 family genes). Every gene in your Cell Ranger matrix must be present in both your database and your custom annotation.
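The consistency checks suggested above can be sketched in Python with plain sets. This is a minimal, dependency-free sketch: the gene lists, chromosome names, and the helper names `report_missing` and `chromosomes_from_region_names` are hypothetical placeholders. In practice the inputs would come from the AnnData gene names, the custom annotation's gene and chromosome columns, and the region names of the rankings database.

```python
def report_missing(expected, available):
    """Return the sorted items of `expected` that are absent from `available`."""
    return sorted(set(expected) - set(available))


def chromosomes_from_region_names(region_names):
    """Extract chromosome names from 'chrom:start-end' region names."""
    return {name.rsplit(":", 1)[0] for name in region_names}


# Hypothetical toy inputs standing in for the real data.
matrix_genes = ["GeneA", "GeneB", "U6", "U6-1"]   # "U6-1" mimics a renamed duplicate
annotation_genes = ["GeneA", "GeneB", "U6"]       # gene column of the annotation
annotation_chroms = ["ma1", "ma2", "un187"]       # chromosome column of the annotation
database_regions = ["ma1:100-600", "ma2:50-550", "myo3:10-510"]

missing_genes = report_missing(matrix_genes, annotation_genes)
database_chroms = chromosomes_from_region_names(database_regions)
only_in_annotation = report_missing(annotation_chroms, database_chroms)
only_in_database = report_missing(database_chroms, annotation_chroms)

print("Genes missing from annotation:", missing_genes)
print("Chromosomes only in annotation:", only_in_annotation)
print("Chromosomes only in database:", only_in_database)
```

Empty lists for all three reports would indicate that the names agree across the inputs.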

SidG13 commented 1 year ago

Hi, thanks for your response. Here's what I did to check my files:

# Check file integrity (Genes) #
> library(anndata)
> ad <- read_h5ad("path/to/my/adata.h5ad")
> ad
AnnData object with n_obs × n_vars = 3865 × 15421
    obs: 'cell_cluster', 'doublet_score', 'predicted_doublet'
    var: 'gene_ids'
    uns: 'cell_cluster_colors', 'scrublet'
    obsm: 'X_umap'

> TSS_annot <- read.table('TSS_annot.txt', header=T, sep='\t') # also called `custom_annot` in python
> ad$var$gene_ids[!(ad$var$gene_ids %in% TSS_annot$Gene)]
  character(0) # no genes from my RNA counts matrix are missing in the TSS annotation file 

# Check file integrity (Chr names) #
> library(arrow)
> library(stringr) # provides str_extract
> rankings_db <- arrow::read_feather('path/to/regions_vs_motifs.rankings.feather')
> sort(unique(str_extract(colnames(rankings_db),'.*(?=:)')))
 [1] "ma1"  "ma2"  "ma3"  "ma4"  "ma5"  "ma6"  "ma7"  "mi1"  "mi10" "mi2"  "mi3"  "mi4"  "mi5"  "mi6"  "mi7"  "mi8"  "mi9"  "myo1" "myo3" "Z"   
> sort(unique(TSS_annot$Chromosome))
 [1] "ma1"   "ma2"   "ma3"   "ma4"   "ma5"   "ma6"   "ma7"   "mi1"   "mi10"  "mi2"   "mi3"   "mi4"   "mi5"   "mi6"   "mi7"   "mi8"   "mi9"   "myo1"  "un187" "Z"

Some chromosomes are in one set and not the other (because there is a gene on it but no motifs annotated, since it was blacklisted during peak calling). I don't think this should cause a KeyError, since the other names are identical and should be paired and recognized properly.

Sorry for the trouble, but do you see anything wrong perhaps? Is there another dataset I should double check for consistency?

Goultard59 commented 1 year ago

I think there are no overlaps between the "foreground" variable and your annotation here: https://github.com/aertslab/pycistarget/blob/0ed8289bda8feb8b137f243864213cee6b06d7f0/pycistarget/motif_enrichment_dem.py#L521

This would result in an "NA" value. You should check that, with the default promoter space, the promoters in your custom annotation do not extend past the ends of the chromosomes.
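One way to check the promoter windows, sketched under assumptions: each TSS is expanded by `promoter_space` on both sides and compared against the chromosome lengths. The chromosome sizes, TSS positions, the helper name, and the value 500 below are illustrative assumptions, not values taken from this dataset or confirmed pycistarget defaults.

```python
def promoter_windows_out_of_bounds(tss_records, chrom_sizes, promoter_space):
    """Return the records whose TSS +/- promoter_space window leaves the chromosome.

    tss_records: iterable of (gene, chromosome, tss_position) tuples.
    chrom_sizes: dict mapping chromosome name to its length in bp.
    """
    flagged = []
    for gene, chrom, tss in tss_records:
        start = tss - promoter_space
        end = tss + promoter_space
        if chrom not in chrom_sizes:
            flagged.append((gene, chrom, "unknown chromosome"))
        elif start < 0 or end > chrom_sizes[chrom]:
            flagged.append((gene, chrom, f"window {start}-{end} out of bounds"))
    return flagged


# Hypothetical toy data.
chrom_sizes = {"ma1": 10_000, "Z": 5_000}
tss_records = [
    ("GeneA", "ma1", 2_000),   # window fits inside the chromosome
    ("GeneB", "ma1", 200),     # window starts before position 0
    ("GeneC", "Z", 4_900),     # window runs past the chromosome end
    ("GeneD", "un187", 100),   # chromosome missing from the size table
]

flagged = promoter_windows_out_of_bounds(tss_records, chrom_sizes, promoter_space=500)
for record in flagged:
    print(record)
```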

SidG13 commented 1 year ago

Here's what my fg_pr_overlap object looks like. A few overlaps are found; the rest are 0. I don't see any NAs. I'm not sure what the data is supposed to look like, though. @SeppeDeWinter, does this look correct to you? I also don't know why two different dataframes are printed here, but maybe this is fine.

print(fg_pr_overlap)
+--------------+-----------+-----------+------------------+
| Chromosome   | Start     | End       | NumberOverlaps   |
| (category)   | (int32)   | (int32)   | (int64)          |
|--------------+-----------+-----------+------------------|
| Z            | 37835199  | 37835699  | 0                |
| Z            | 22469724  | 22470224  | 0                |
| Z            | 15777279  | 15777779  | 0                |
| Z            | 89932921  | 89933421  | 0                |
| ...          | ...       | ...       | ...              |
| mi10         | 2433294   | 2433794   | 1                |
| mi10         | 2266775   | 2267275   | 0                |
| mi10         | 185654    | 186154    | 1                |
| mi10         | 1576246   | 1576746   | 0                |
+--------------+-----------+-----------+------------------+
Unstranded PyRanges object has 2,868 rows and 4 columns from 18 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
+--------------+-----------+-----------+------------------+
| Chromosome   | Start     | End       | NumberOverlaps   |
| (category)   | (int32)   | (int32)   | (int64)          |
|--------------+-----------+-----------+------------------|
| Z            | 18489754  | 18490254  | 0                |
| Z            | 60540869  | 60541369  | 0                |
| Z            | 99620210  | 99620710  | 0                |
| Z            | 67880911  | 67881411  | 0                |
| ...          | ...       | ...       | ...              |
| ma5          | 14717191  | 14717691  | 0                |
| ma5          | 37661785  | 37662285  | 0                |
| ma5          | 14718445  | 14718945  | 0                |
| ma5          | 14721267  | 14721767  | 0                |
| ma6          | 11729879  | 11730379  | 0                |
| ma6          | 30263065  | 30263565  | 0                |
| ma7          | 52411710  | 52412210  | 0                |
+--------------+-----------+-----------+------------------+
Unstranded PyRanges object has 23 rows and 4 columns from 8 chromosomes.
For printing, the PyRanges was sorted on Chromosome.

Just to be thorough I checked if any values were indeed NA like so, but got nothing:

print(fg_pr_overlap[fg_pr_overlap.NumberOverlaps == 'NA'])
Empty PyRanges

SeppeDeWinter commented 1 year ago

Hi @SidG13

Sorry for the late reply.

The issue is probably caused by one of your region sets not containing any promoters, which results in an empty pyranges object. This in turn causes an error in the coord_to_region_names function, which is a bug.
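The two tables printed earlier have 2,868 and 23 rows, which matches the sizes the log reports for Topic1 and Topic2, so the print was probably reached once per region set before the crash. To spot region sets without promoters ahead of time, one could count the regions with a nonzero overlap per set. The sketch below assumes the overlap counts are already available as plain lists; the helper name and the numbers are illustrative, not taken from the real data.

```python
def sets_without_promoters(overlap_counts):
    """Return names of region sets in which no region overlaps a promoter.

    overlap_counts: dict mapping region-set name to a list of per-region
    overlap counts (the NumberOverlaps column in the printed output).
    """
    return [
        name
        for name, counts in overlap_counts.items()
        if not any(count != 0 for count in counts)
    ]


# Illustrative toy counts only.
overlap_counts = {
    "Topic1": [0, 0, 1, 0, 1],
    "Topic2": [0, 0, 0, 0],
}

print(sets_without_promoters(overlap_counts))
```

Any set reported here would produce an empty selection after filtering on nonzero overlaps, which is the case the diagnosis above describes.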

The bug should be fixed with this commit https://github.com/aertslab/pycistarget/commit/5c355396393a44458934032e000bd22876a4d071.

Now the function returns an empty list instead of crashing, see example below:

In [1]: coord_to_region_names(pr.PyRanges())
Out[1]: []
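The idea behind the fix can be sketched as an early return on empty input. This is a dependency-free illustration, not the code from the commit: the table is modeled as a dict of column lists, the function name `region_names_from_columns` is hypothetical, and an empty table is assumed to have no columns at all, which appears to be the situation that raised KeyError: 'Chromosome' above.

```python
def region_names_from_columns(columns):
    """Build 'chrom:start-end' names from a dict of column lists.

    An empty or column-less table yields an empty list instead of
    raising KeyError.
    """
    if not columns or "Chromosome" not in columns:
        return []
    return [
        f"{chrom}:{start}-{end}"
        for chrom, start, end in zip(
            columns["Chromosome"], columns["Start"], columns["End"]
        )
    ]


empty_result = region_names_from_columns({})
filled_result = region_names_from_columns(
    {
        "Chromosome": ["mi10", "mi10"],
        "Start": [2433294, 185654],
        "End": [2433794, 186154],
    }
)
print(empty_result)
print(filled_result)
```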

p.s. @Goultard59, thanks for the help!

I hope this fixes your issue?

Best,

Seppe

SidG13 commented 1 year ago

Thanks @SeppeDeWinter, the wrapper certainly does go much further now! I don't want to paste all the output here, but all of the following folders have been created (and populated with all the HTML files):

image

However, I do get the following error right at the very end. Unfortunately I can't narrow it down much, but it's probably related to pickling?

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In [8], line 8
      5 custom_annot = pd.read_csv('/home/administrator/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/data_manipulation/TSS_annot.txt', header=0, sep='\t')
      6 #custom_annot
----> 8 run_pycistarget(
      9     region_sets = region_sets,
     10     species = 'custom',
     11     save_path = os.path.join(work_dir, 'motifs'),
     12     custom_annot = custom_annot,
     13     ctx_db_path = rankings_db,
     14     dem_db_path = scores_db,
     15     path_to_motif_annotations = motif_annotation,
     16     run_without_promoters = True,
     17     n_cpu = 4,
     18     _temp_dir = os.path.join(tmp_dir, 'ray_spill'),
     19     annotation_version = 'v1',
     20     )

File ~/Desktop/ExtraDrive1/Sid/sc_multiome/SCENICplus_multiome/scenicplus/src/scenicplus/wrappers/run_pycistarget.py:331, in run_pycistarget(region_sets, species, save_path, custom_annot, save_partial, ctx_db_path, dem_db_path, run_without_promoters, biomart_host, promoter_space, ctx_auc_threshold, ctx_nes_threshold, ctx_rank_threshold, dem_log2fc_thr, dem_motif_hit_thr, dem_max_bg_regions, annotation, motif_similarity_fdr, path_to_motif_annotations, annotation_version, n_cpu, _temp_dir, exclude_motifs, exclude_collection, **kwargs)
    329 log.info('Saving object')         
    330 with open(os.path.join(save_path,'menr.pkl'), 'wb') as f:
--> 331     dill.dump(menr, f, protocol=-1)
    333 log.info('Finished! Took {} minutes'.format((time.time() - start_time)/60))

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:336, in dump(obj, file, protocol, byref, fmode, recurse, **kwds)
    334 _kwds = kwds.copy()
    335 _kwds.update(dict(byref=byref, fmode=fmode, recurse=recurse))
--> 336 Pickler(file, protocol, **_kwds).dump(obj)
    337 return

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:620, in Pickler.dump(self, obj)
    618     raise PicklingError(msg)
    619 else:
--> 620     StockPickler.dump(self, obj)
    621 return

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:487, in _Pickler.dump(self, obj)
    485 if self.proto >= 4:
    486     self.framer.start_framing()
--> 487 self.save(obj)
    488 self.write(STOP)
    489 self.framer.end_framing()

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:1251, in save_module_dict(pickler, obj)
   1248     if is_dill(pickler, child=False) and pickler._session:
   1249         # we only care about session the first pass thru
   1250         pickler._first_pass = False
-> 1251     StockPickler.save_dict(pickler, obj)
   1252     log.info("# D2")
   1253 return

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:971, in _Pickler.save_dict(self, obj)
    968     self.write(MARK + DICT)
    970 self.memoize(obj)
--> 971 self._batch_setitems(obj.items())

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:997, in _Pickler._batch_setitems(self, items)
    995     for k, v in tmp:
    996         save(k)
--> 997         save(v)
    998     write(SETITEMS)
    999 elif n:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:1251, in save_module_dict(pickler, obj)
   1248     if is_dill(pickler, child=False) and pickler._session:
   1249         # we only care about session the first pass thru
   1250         pickler._first_pass = False
-> 1251     StockPickler.save_dict(pickler, obj)
   1252     log.info("# D2")
   1253 return

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:971, in _Pickler.save_dict(self, obj)
    968     self.write(MARK + DICT)
    970 self.memoize(obj)
--> 971 self._batch_setitems(obj.items())

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:997, in _Pickler._batch_setitems(self, items)
    995     for k, v in tmp:
    996         save(k)
--> 997         save(v)
    998     write(SETITEMS)
    999 elif n:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:603, in _Pickler.save(self, obj, save_persistent_id)
    599     raise PicklingError("Tuple returned by %s must have "
    600                         "two to six elements" % reduce)
    602 # Save the reduce() output and finally memoize the object
--> 603 self.save_reduce(obj=obj, *rv)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:717, in _Pickler.save_reduce(self, func, args, state, listitems, dictitems, state_setter, obj)
    715 if state is not None:
    716     if state_setter is None:
--> 717         save(state)
    718         write(BUILD)
    719     else:
    720         # If a state_setter is specified, call it instead of load_build
    721         # to update obj's with its previous state.
    722         # First, push state_setter and its tuple of expected arguments
    723         # (obj, state) onto the stack.

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:1251, in save_module_dict(pickler, obj)
   1248     if is_dill(pickler, child=False) and pickler._session:
   1249         # we only care about session the first pass thru
   1250         pickler._first_pass = False
-> 1251     StockPickler.save_dict(pickler, obj)
   1252     log.info("# D2")
   1253 return

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:971, in _Pickler.save_dict(self, obj)
    968     self.write(MARK + DICT)
    970 self.memoize(obj)
--> 971 self._batch_setitems(obj.items())

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:997, in _Pickler._batch_setitems(self, items)
    995     for k, v in tmp:
    996         save(k)
--> 997         save(v)
    998     write(SETITEMS)
    999 elif n:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:603, in _Pickler.save(self, obj, save_persistent_id)
    599     raise PicklingError("Tuple returned by %s must have "
    600                         "two to six elements" % reduce)
    602 # Save the reduce() output and finally memoize the object
--> 603 self.save_reduce(obj=obj, *rv)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:717, in _Pickler.save_reduce(self, func, args, state, listitems, dictitems, state_setter, obj)
    715 if state is not None:
    716     if state_setter is None:
--> 717         save(state)
    718         write(BUILD)
    719     else:
    720         # If a state_setter is specified, call it instead of load_build
    721         # to update obj's with its previous state.
    722         # First, push state_setter and its tuple of expected arguments
    723         # (obj, state) onto the stack.

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/site-packages/dill/_dill.py:1251, in save_module_dict(pickler, obj)
   1248     if is_dill(pickler, child=False) and pickler._session:
   1249         # we only care about session the first pass thru
   1250         pickler._first_pass = False
-> 1251     StockPickler.save_dict(pickler, obj)
   1252     log.info("# D2")
   1253 return

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:971, in _Pickler.save_dict(self, obj)
    968     self.write(MARK + DICT)
    970 self.memoize(obj)
--> 971 self._batch_setitems(obj.items())

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:997, in _Pickler._batch_setitems(self, items)
    995     for k, v in tmp:
    996         save(k)
--> 997         save(v)
    998     write(SETITEMS)
    999 elif n:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:603, in _Pickler.save(self, obj, save_persistent_id)
    599     raise PicklingError("Tuple returned by %s must have "
    600                         "two to six elements" % reduce)
    602 # Save the reduce() output and finally memoize the object
--> 603 self.save_reduce(obj=obj, *rv)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:692, in _Pickler.save_reduce(self, func, args, state, listitems, dictitems, state_setter, obj)
    690 else:
    691     save(func)
--> 692     save(args)
    693     write(REDUCE)
    695 if obj is not None:
    696     # If the object is already in the memo, this means it is
    697     # recursive. In this case, throw away everything we put on the
    698     # stack, and fetch the object back from the memo.

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:886, in _Pickler.save_tuple(self, obj)
    884 if n <= 3 and self.proto >= 2:
    885     for element in obj:
--> 886         save(element)
    887     # Subtle.  Same as in the big comment below.
    888     if id(obj) in memo:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:886, in _Pickler.save_tuple(self, obj)
    884 if n <= 3 and self.proto >= 2:
    885     for element in obj:
--> 886         save(element)
    887     # Subtle.  Same as in the big comment below.
    888     if id(obj) in memo:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:603, in _Pickler.save(self, obj, save_persistent_id)
    599     raise PicklingError("Tuple returned by %s must have "
    600                         "two to six elements" % reduce)
    602 # Save the reduce() output and finally memoize the object
--> 603 self.save_reduce(obj=obj, *rv)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:692, in _Pickler.save_reduce(self, func, args, state, listitems, dictitems, state_setter, obj)
    690 else:
    691     save(func)
--> 692     save(args)
    693     write(REDUCE)
    695 if obj is not None:
    696     # If the object is already in the memo, this means it is
    697     # recursive. In this case, throw away everything we put on the
    698     # stack, and fetch the object back from the memo.

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:886, in _Pickler.save_tuple(self, obj)
    884 if n <= 3 and self.proto >= 2:
    885     for element in obj:
--> 886         save(element)
    887     # Subtle.  Same as in the big comment below.
    888     if id(obj) in memo:

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:603, in _Pickler.save(self, obj, save_persistent_id)
    599     raise PicklingError("Tuple returned by %s must have "
    600                         "two to six elements" % reduce)
    602 # Save the reduce() output and finally memoize the object
--> 603 self.save_reduce(obj=obj, *rv)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:692, in _Pickler.save_reduce(self, func, args, state, listitems, dictitems, state_setter, obj)
    690 else:
    691     save(func)
--> 692     save(args)
    693     write(REDUCE)
    695 if obj is not None:
    696     # If the object is already in the memo, this means it is
    697     # recursive. In this case, throw away everything we put on the
    698     # stack, and fetch the object back from the memo.

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:901, in _Pickler.save_tuple(self, obj)
    899 write(MARK)
    900 for element in obj:
--> 901     save(element)
    903 if id(obj) in memo:
    904     # Subtle.  d was not in memo when we entered save_tuple(), so
    905     # the process of saving the tuple's elements must have saved
   (...)
    909     # could have been done in the "for element" loop instead, but
    910     # recursive tuples are a rare thing.
    911     get = self.get(memo[id(obj)][0])

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:560, in _Pickler.save(self, obj, save_persistent_id)
    558 f = self.dispatch.get(t)
    559 if f is not None:
--> 560     f(self, obj)  # Call unbound method with explicit self
    561     return
    563 # Check private dispatch table if any, or else
    564 # copyreg.dispatch_table

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:839, in _Pickler.save_picklebuffer(self, obj)
    835 if in_band:
    836     # Write data in-band
    837     # XXX The C implementation avoids a copy here
    838     if m.readonly:
--> 839         self.save_bytes(m.tobytes())
    840     else:
    841         self.save_bytearray(m.tobytes())

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:806, in _Pickler.save_bytes(self, obj)
    804 else:
    805     self.write(BINBYTES + pack("<I", n) + obj)
--> 806 self.memoize(obj)

File ~/anaconda2/envs/py38/lib/python3.8/pickle.py:508, in _Pickler.memoize(self, obj)
    506 if self.fast:
    507     return
--> 508 assert id(obj) not in self.memo
    509 idx = len(self.memo)
    510 self.write(self.put(idx))

AssertionError:

When I try the next codeblock to open the menr.pkl file:

[image: screenshot of the result of loading menr.pkl]

SeppeDeWinter commented 1 year ago

Hi @SidG13

That's very unfortunate ... I have never seen this so I'm not sure what happened.

I would suggest running the function again with save_partial set to True. That way, intermediate results will be saved in the folders you showed above, and you will have a backup in case the saving fails again.

Best,

Seppe

SidG13 commented 1 year ago

Thanks @SeppeDeWinter. I'm not sure how to use the backup files that were made, since I can't proceed with the workflow. In fact, even running with save_partial = True still throws what I think is a dill error. It's odd that the menr.pkl file is created at all, so it's probably being corrupted somehow. Do you think it's a Python version or dill version issue?

Would it help if I sent you the requisite data (annotation, database files etc.) to test on your end to see if the error can be replicated?
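One way to test the suspicion that menr.pkl is corrupted is to try loading it and report the exception, since a dump that fails partway can leave a truncated file behind. The following is a minimal sketch, not part of scenicplus: `check_pickle` is a hypothetical helper name, and it uses the standard-library `pickle` loader. For a file written with dill, `dill.load` may be needed in place of `pickle.load`.

```python
import pickle


def check_pickle(path):
    """Try to unpickle the file at `path`.

    Returns (True, None) if the file loads completely, otherwise
    (False, error) where `error` is the exception that was raised.
    Hypothetical helper for illustration; only load files you trust.
    """
    try:
        with open(path, "rb") as handle:
            # Assumption: swap in dill.load(handle) if the file holds
            # objects that only dill can restore.
            pickle.load(handle)
    except Exception as err:  # truncated files typically raise EOFError or UnpicklingError
        return False, err
    return True, None
```

Calling `check_pickle` on the path to menr.pkl returns the underlying exception, which helps distinguish a truncated file from one that merely needs a different loader.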

SeppeDeWinter commented 1 year ago

Hi @SidG13

If you don't mind sending me the data then I'll do some tests after the weekend. I'm really not sure what is going wrong.

You can send it to seppe.dewinter@kuleuven.be

Best

Seppe

SidG13 commented 1 year ago

Thanks so much! Just sent.

Goultard59 commented 1 year ago

Hi,

After running the pipeline with more samples, I encountered the same two errors.

Traceback (most recent call last):
  File "/home/adufour/save/scripts/omicscenic.py", line 166, in <module>
    run_pycistarget(
  File "/work/adufour/scenicplus/src/scenicplus/wrappers/run_pycistarget.py", line 331, in run_pycistarget
    dill.dump(menr, f, protocol=-1)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 336, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 620, in dump
    StockPickler.dump(self, obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 487, in dump
    self.save(obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 717, in save_reduce
    save(state)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 717, in save_reduce
    save(state)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/dill/_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 692, in save_reduce
    save(args)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 886, in save_tuple
    save(element)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 886, in save_tuple
    save(element)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 692, in save_reduce
    save(args)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 886, in save_tuple
    save(element)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 692, in save_reduce
    save(args)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 901, in save_tuple
    save(element)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 839, in save_picklebuffer
    self.save_bytes(m.tobytes())
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 806, in save_bytes
    self.memoize(obj)
  File "/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/pickle.py", line 508, in memoize
    assert id(obj) not in self.memo
AssertionError
Fatal error condition occurred in /opt/vcpkg/buildtrees/aws-c-io/src/9e6648842a-364b708815.clean/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200af06) [0x7feb9adf3f06]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x20028e5) [0x7feb9adeb8e5]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f27e09) [0x7feb9ad10e09]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7feb9adf4a3d]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f25948) [0x7feb9ad0e948]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7feb9adf4a3d]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1ee0b46) [0x7feb9acc9b46]
/work/adufour/anaconda3/envs/scenicplus/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x194546a) [0x7feb9a72e46a]
/lib64/libc.so.6(+0x39ce9) [0x7fec466a7ce9]
/lib64/libc.so.6(+0x39d37) [0x7fec466a7d37]
/lib64/libc.so.6(__libc_start_main+0xfc) [0x7fec4669055c]
python(+0x1ce125) [0x5566c3af3125]
/var/spool/slurm/d/job42616761/slurm_script: line 12: 25552 Aborted                 python /home/adufour/save/scripts/omicscenic.py

Goultard59 commented 1 year ago

Upgrading to pyarrow >10.0 solved the problems for me.

SidG13 commented 1 year ago

Unfortunately, I'm still getting the same error as before after updating to pyarrow v10.0.1. My guess is that it's a dill error; my dill version is 0.3.5.1.
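Since the discussion turns on which pyarrow, dill, and Python versions are installed, a small report of the relevant versions makes the environments easier to compare. This is a sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The function name `report_versions` and the default package list are assumptions chosen for illustration.

```python
import pickle
import sys
from importlib import metadata


def report_versions(packages=("dill", "pyarrow", "scenicplus")):
    """Collect interpreter, pickle protocol, and package versions in a dict.

    Packages that are not installed are reported as None.
    """
    report = {
        "python": sys.version.split()[0],
        # protocol=-1 in dump() resolves to this value
        "pickle.HIGHEST_PROTOCOL": pickle.HIGHEST_PROTOCOL,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```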

SeppeDeWinter commented 1 year ago

@SidG13

I finally got around to trying to debug the issue using your data, my apologies for the delay (it has been a very busy period).

I was able to save the motif enrichment dictionary (menr) for your data using protocol 4 of dill. I pushed a change to the development branch so that this protocol is used by default (https://github.com/aertslab/scenicplus/commit/f322c5b09b54213b8ec5062f25cc0b4143e92ee4). If you pull the code from that branch (git clone --branch development https://github.com/aertslab/scenicplus.git) and reinstall scenicplus, you should be able to save the motif enrichment results.
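For context, the failing call in the traceback above is `dill.dump(menr, f, protocol=-1)`. A protocol of -1 selects the highest available protocol, which is 5 on Python 3.8, and the traceback passes through `save_picklebuffer`, a code path that is only used with protocol 5. The snippet below is a minimal sketch of dumping with an explicit protocol 4 instead. It is not the code from the commit: `save_with_protocol` is a hypothetical helper, the `menr` dictionary is a small stand-in for the real motif enrichment results, and the standard-library `pickle` is used as a fallback when dill is not installed.

```python
import io
import pickle

try:
    import dill as serializer  # the library used in the traceback above
except ImportError:
    serializer = pickle  # stdlib fallback with the same dump/load signature


def save_with_protocol(obj, fileobj, protocol=4):
    """Serialize `obj` to an open binary file object with a fixed protocol."""
    serializer.dump(obj, fileobj, protocol=protocol)


# Hypothetical stand-in for the motif enrichment dictionary.
menr = {"DEM_topics": {"motif_1": [0.1, 0.2]}}

buffer = io.BytesIO()
save_with_protocol(menr, buffer)

# Round-trip to confirm that the data survives the dump.
buffer.seek(0)
restored = serializer.load(buffer)
```

With a real file, the same helper would be called inside `with open(path, "wb") as f:`.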

I hope this fixes this annoying issue.

Best,

Seppe

SidG13 commented 1 year ago

Thank you, this worked!