Open YangLi-Bio opened 1 year ago
Hi @YangLi-Bio
Downgrading your pydantic version to 1.x might solve your issue, see: https://discuss.ray.io/t/pydantic-dataclasses-dataclass-only-supports-init-false/11278.
This piece of code can help with debugging:
import os
import ray

# tmp_dir is the variable defined in your script ('tmp_dir/')
ray.init(
    num_cpus = 5,
    _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
Best,
Seppe
Dear Seppe,
Thanks for your patience and time.
This error still occurs even though I downgraded pydantic to 1.10.11. The piece of code you provided ran successfully, but the original error has not changed.
Could you please help figure out a feasible solution?
Best regards,
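Since the downgrade did not change the behaviour, it may be worth confirming which package versions the interpreter that runs the script actually sees. The following is a minimal sketch using only the standard library; the helper name `installed_version` and the list of package names are illustrative assumptions, not something taken from pycisTopic.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names are assumptions based on the libraries mentioned in this thread.
for name in ("pydantic", "ray", "pandas", "pycisTopic"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this with the same Python executable as the failing script (here presumably the one in the `scenicplus` conda environment) shows whether pydantic 1.10.11 is really the version being imported.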
Hi @YangLi-Bio
Sorry to hear that downgrading did not work.
Just to be sure, running
import os
import ray

# tmp_dir is the variable defined in your script ('tmp_dir/')
ray.init(
    num_cpus = 5,
    _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
does not cause any error?
Best,
Seppe
Dear Seppe,
No, it did not cause any error. I re-ran the full script and a different error now occurs instead:
Finished Topic modeling
Finished model evaluation
Traceback (most recent call last):
File "2-scenicplus-preprocessing.py", line 119, in <module>
region_bin_topics_top3k = binarize_topics(cistopic_obj, method='ntop', ntop = 3000)
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/topic_binarization.py", line 121, in binarize_topics
data.iloc[
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/core/indexing.py", line 1068, in __getitem__
return self._getitem_tuple(key)
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/core/indexing.py", line 1564, in _getitem_tuple
tup = self._validate_tuple_indexer(tup)
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/core/indexing.py", line 874, in _validate_tuple_indexer
self._validate_key(k, i)
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/core/indexing.py", line 1467, in _validate_key
self._validate_integer(key, axis)
File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/core/indexing.py", line 1558, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
The code is as follows:
print ("Finished model evaluation")
# Inferring candidate enhancer regions
from pycisTopic.topic_binarization import *
region_bin_topics_otsu = binarize_topics(cistopic_obj, method='otsu')
region_bin_topics_top3k = binarize_topics(cistopic_obj, method='ntop', ntop = 3000)
Hi @YangLi-Bio
Can you show:
cistopic_obj.selected_model.topic_region
Best,
Seppe
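One thing the requested table would reveal is how many regions the model contains. The printed object in the original report says n_cells × n_regions = 320 × 1678, while the failing call asks for `ntop = 3000`, so a positional lookup beyond the last row is a plausible (unconfirmed) cause of the `IndexError`. The sketch below reproduces that pandas behaviour on a toy table; it assumes `topic_region` is a pandas DataFrame with regions as rows and topics as columns, and `clamp_ntop` is a hypothetical helper, not part of pycisTopic.

```python
import numpy as np
import pandas as pd

# Toy stand-in for cistopic_obj.selected_model.topic_region
# (assumed layout: regions as rows, topics as columns).
n_regions, n_topics = 1678, 4
rng = np.random.default_rng(0)
topic_region = pd.DataFrame(
    rng.random((n_regions, n_topics)),
    index=[f"region_{i}" for i in range(n_regions)],
    columns=[f"Topic{i + 1}" for i in range(n_topics)],
)


def clamp_ntop(requested_ntop, table):
    """Hypothetical helper: never ask for more regions than the table has."""
    return min(requested_ntop, table.shape[0])


ntop = 3000
scores = topic_region["Topic1"].sort_values(ascending=False)

# A single positional index past the end raises the same error as in the traceback.
try:
    scores.iloc[ntop]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

# Clamping the request keeps the selection inside the table.
safe_ntop = clamp_ntop(ntop, topic_region)
top_regions = scores.iloc[:safe_ntop]
```

If the real table also has fewer than 3000 rows, calling `binarize_topics` with an `ntop` no larger than the number of regions would be a simple thing to try.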
I encountered the same error and was able to bypass it by removing the last argument, _temp_dir, from the call, as below:
models = run_cgs_models(cistopic_obj,
                        n_topics=[2,4,10,16,32,48],
                        n_cpu=40,
                        n_iter=500,
                        random_state=555,
                        alpha=50,
                        alpha_by_topic=True,
                        eta=0.1,
                        eta_by_topic=False,
                        save_path=None)
                        # _temp_dir = os.path.join(tmp_dir + 'ray_spill'))  # error with this
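A possible alternative to dropping `_temp_dir` altogether is to pass an absolute path. The timeout message in the original report shows Ray looking for its socket under the relative path `tmp_dir/ray_spill/...`, and Ray's documentation, as far as I know, describes the temporary directory as an absolute path. Whether this is the actual cause here is an assumption. The helper names `resolve_ray_temp_dir` and `start_ray` below are hypothetical; only `ray.init(num_cpus=..., _temp_dir=...)` comes from the thread.

```python
import os
import tempfile


def resolve_ray_temp_dir(base_dir, name="ray_spill"):
    """Return an absolute path for Ray's temp dir under base_dir, creating it if needed."""
    path = os.path.abspath(os.path.join(base_dir, name))
    os.makedirs(path, exist_ok=True)
    return path


def start_ray(n_cpu, base_dir):
    """Start Ray with an absolute temp dir (hypothetical wrapper, not pycisTopic API)."""
    import ray  # imported lazily so the path helper works without Ray installed

    ray.init(num_cpus=n_cpu, _temp_dir=resolve_ray_temp_dir(base_dir))


# Demonstrate the path handling on a throwaway directory.
ray_temp_dir = resolve_ray_temp_dir(tempfile.mkdtemp())
print(ray_temp_dir)
```

In the script from this issue, the same idea would mean passing `_temp_dir = resolve_ray_temp_dir(tmp_dir)` to `run_cgs_models` instead of the relative `os.path.join(tmp_dir + 'ray_spill')`.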
Describe the bug
I successfully ran `create_cistopic_object_from_fragments` and saved the resulting `cistopic_obj` to disk. However, when I tried to run `run_cgs_models`, I got an error message along with several secondary errors. The error is `AssertionError: pydantic.dataclasses.dataclass only supports init=False`.
To Reproduce
`# suppress warnings
import warnings
warnings.simplefilter(action = 'ignore', category = FutureWarning)
import sys
import os
_stderr = sys.stderr
null = open(os.devnull,'wb')
work_dir = '/fs/ess/PCON0022/liyang/STREAM-revision/Feasibility/scenic-plus-data/'
tmp_dir = 'tmp_dir/'

# scRNA-seq preprocessing using Scanpy
import scanpy as sc
adata = sc.read_h5ad(work_dir + '10X_hg38_PBMC_3k_3k_3k.h5ad')

# Data normalization
adata.raw = adata
sc.pp.normalize_total(adata, target_sum = 1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean = 0.0125, max_mean = 3, min_disp = 0.5)
adata = adata[:, adata.var.highly_variable]
sc.pp.scale(adata, max_value = 10)

# Cell type annotation
sc.pp.neighbors(adata, n_neighbors = 10, n_pcs = 10)
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution = 0.8, key_added = 'leiden_res_0.8')
sc.pl.umap(adata, color = 'leiden_res_0.8', save = "_cell_clusters.png")
adata.obs['celltype'] = adata.obs['leiden_res_0.8']
adata.write(os.path.join(work_dir, 'adata_GEX.h5ad'), compression = 'gzip')

# scATAC-seq preprocessing using pycisTopic
scRNA_bc = adata.obs_names
cell_data = adata.obs
cell_data['sample_id'] = '10x_pbmc'
cell_data['celltype'] = cell_data['celltype'].astype(str)

# Load scATAC-seq data
import pickle
import os
import pycisTopic
fragments_dict = {'10x_pbmc': os.path.join(work_dir, '../10X_hg38_PBMC_3k_3k_3k_fragments.tsv.gz')}
path_to_regions = {'10x_pbmc': os.path.join(work_dir, '10X_hg38_PBMC_3k_3k_3k.bed')}
path_to_blacklist = '/fs/ess/PCON0022/liyang/STREAM-revision/Feasibility/scenic-plus/blacklist-regions/hg38_ENCFF356LFX.bed'

# Create pycisTopic object
from pycisTopic.cistopic_class import *
key = '10x_pbmc'
cistopic_obj = create_cistopic_object_from_fragments(
    path_to_fragments = fragments_dict[key],
    path_to_regions = path_to_regions[key],
    path_to_blacklist=path_to_blacklist,
    valid_bc = list(set(scRNA_bc)),
    n_cpu = 1,
    project = key,
    split_pattern = '-')
cistopic_obj.add_cell_data(cell_data, split_pattern = '-')
print(cistopic_obj)
pickle.dump(cistopic_obj, open(os.path.join(work_dir, 'scATAC_cistopic_obj.pkl'), 'wb'))

# Topic modeling
models = run_cgs_models(cistopic_obj,
    n_topics = [2, 4, 10, 16, 32, 48],
    n_cpu = 5,
    n_iter = 500,
    random_state = 555,
    alpha = 50,
    alpha_by_topic = True,
    eta = 0.1,
    eta_by_topic = False,
    save_path = None,
    _temp_dir = os.path.join(tmp_dir + 'ray_spill'))
pickle.dump(models, open(os.path.join(work_dir, 'scATAC_models.pkl'), 'wb'))`
Error output
`AssertionError: pydantic.dataclasses.dataclass only supports init=False
Columns ['sample_id'] will be overwritten
CistopicObject from project 10x_pbmc with n_cells × n_regions = 320 × 1678
Traceback (most recent call last):
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/ray/_private/node.py", line 293, in __init__
    ray._private.services.wait_for_node(
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/ray/_private/services.py", line 459, in wait_for_node
    raise TimeoutError(
TimeoutError: Timed out after 30 seconds while waiting for node to startup. Did not find socket name tmp_dir/ray_spill/session_2023-07-18_06-33-23_115215_200125/sockets/plasma_store in the list of object store socket names.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "2-Run-scenicplus.py", line 84, in <module>
    models = run_cgs_models(cistopic_obj,
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/lda_models.py", line 154, in run_cgs_models
    ray.init(num_cpus=n_cpu, **kwargs)
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/ray/_private/worker.py", line 1534, in init
    _global_node = ray._private.node.Node(
  File "/users/PAS1475/liyang/.conda/envs/scenicplus/lib/python3.8/site-packages/ray/_private/node.py", line 298, in __init__
    raise Exception(
Exception: The current node timed out during startup. This could happen because some of the Ray processes failed to startup.`
Expected behavior
I expected to get the topic modeling results.