Closed alexwang1001 closed 1 year ago
.get_feature_names_out() instead of get_feature_names()
That is what I thought as well. Is this a bug in run_scenicplus that needs to be fixed?
Hi both
You're right. get_feature_names got replaced by get_feature_names_out (see: https://github.com/scikit-learn/scikit-learn/pull/18444). I will update the code.
Best,
Seppe
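For anyone who has to support both older and newer scikit-learn releases in the meantime, a minimal sketch of a version-tolerant lookup is shown here. The helper name feature_names and the two stand-in classes are hypothetical and only there so the example runs without scikit-learn installed; this is not the fix that went into scenicplus. With a real fitted CountVectorizer you would pass the vectorizer itself.

```python
def feature_names(vectorizer):
    """Return feature names from a fitted vectorizer as a plain list."""
    # Prefer the newer get_feature_names_out(); fall back to the older
    # get_feature_names() when the object does not provide the new method.
    getter = getattr(vectorizer, "get_feature_names_out", None)
    if getter is None:
        getter = vectorizer.get_feature_names
    return list(getter())


# Hypothetical stand-ins (NOT scikit-learn classes) that mimic the two APIs.
class OldStyleVectorizer:
    def get_feature_names(self):
        return ["chr1:100-200", "chr2:300-400"]


class NewStyleVectorizer:
    def get_feature_names_out(self):
        return ["chr1:100-200", "chr2:300-400"]


old_names = feature_names(OldStyleVectorizer())
new_names = feature_names(NewStyleVectorizer())
print(old_names == new_names)
```

The getattr fallback keeps a single call site working on either side of the rename, at the cost of hiding which API is actually in use.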
Hi everyone, I am running into the same issue using Scenic+1.01:
, line 174, in export_to_loom
), columns=cv.get_feature_names(), index=regulons.keys())
AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'
I have created my pyscenic file using the method described in https://github.com/aertslab/scenicplus/issues/48#issuecomment-1285838142 as I am trying to make scenicplus run for zebrafish.
Would be very happy for any ideas! Best, Jo
I received the same error as the OP while running run_scenicplus on the PBMC tutorial using scenicplus 1.0.1.dev2+g26677cb.
A patch seems to be available in the development branch.
@SeppeDeWinter should I switch the scenicplus git repo to the development branch?
Do you plan to merge the fix into the master branch?
Thanks in advance for your help.
Getting this error as well using scenicplus v. 1.0.1.dev4+ge4bdd9f
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[59], line 23
20 except Exception as e:
21 #in case of failure, still save the object
22 dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
---> 23 raise(e)
Cell In[59], line 3
1 from scenicplus.wrappers.run_scenicplus import run_scenicplus
2 try:
----> 3 run_scenicplus(
4 scplus_obj = scplus_obj,
5 variable = [ KEY_TO_GROUP_BY_1 ],
6 species = 'mmusculus', # hsapiens mmusculus
7 assembly = 'mm10', # hg38 mm10
8 tf_file = '/media/solvi/WORKSITE1001/refDBs/allTFs_mm.txt',
9 save_path = os.path.join(work_dir, 'scenicplus'),
10 biomart_host = biomart_host,
11 upstream = [1000, 150000],
12 downstream = [1000, 150000],
13 calculate_TF_eGRN_correlation = True,
14 calculate_DEGs_DARs = True,
15 export_to_loom_file = True,
16 export_to_UCSC_file = True,
17 path_bedToBigBed = 'MU4',
18 n_cpu = NCPUS ,
19 _temp_dir = os.path.join(tmpDir, 'ray_spill'))
20 except Exception as e:
21 #in case of failure, still save the object
22 dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
File ~/scenicplus/src/scenicplus/wrappers/run_scenicplus.py:323, in run_scenicplus(scplus_obj, variable, species, assembly, tf_file, save_path, biomart_host, upstream, downstream, region_ranking, gene_ranking, simplified_eGRN, calculate_TF_eGRN_correlation, calculate_DEGs_DARs, export_to_loom_file, export_to_UCSC_file, tree_structure, path_bedToBigBed, n_cpu, _temp_dir, save_partial, **kwargs)
321 if export_to_loom_file is True:
322 log.info('Exporting to loom file')
--> 323 export_to_loom(scplus_obj,
324 signature_key = 'Gene_based',
325 tree_structure = tree_structure,
326 title = 'Gene based eGRN',
327 nomenclature = assembly,
328 out_fname=os.path.join(save_path,'SCENIC+_gene_based.loom'))
329 export_to_loom(scplus_obj,
330 signature_key = 'Region_based',
331 tree_structure = tree_structure,
332 title = 'Region based eGRN',
333 nomenclature = assembly,
334 out_fname=os.path.join(save_path,'SCENIC+_region_based.loom'))
336 if export_to_UCSC_file is True:
File ~/scenicplus/src/scenicplus/loom.py:174, in export_to_loom(scplus_obj, signature_key, out_fname, eRegulon_metadata_key, auc_key, auc_thr_key, keep_direct_and_extended_if_not_direct, selected_features, selected_cells, cluster_annotation, tree_structure, title, nomenclature)
170 cv = CountVectorizer(
171 lowercase=False, token_pattern=r'(?u)\b\w\w+\b:\b\w\w+\b-\b\w\w+\b')
172 regulon_mat = cv.fit_transform(regulons.values())
173 regulon_mat = pd.DataFrame(regulon_mat.todense(
--> 174 ), columns=cv.get_feature_names(), index=regulons.keys())
175 regulon_mat = regulon_mat.reindex(columns=feature_names, fill_value=0).T
176 if keep_direct_and_extended_if_not_direct is True:
AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'
Hi,
I got around this problem (if I remember correctly) using the following Singularity container. Here is the recipe to build the container:
# to build: singularity build --force --fakeroot scenicplus.sif scenicplus.def
BootStrap: docker
From: ubuntu:22.04
%setup
%environment
export PATH=/miniconda3/bin:$PATH
export PATH=/ucsc.v386:$PATH
%post
apt-get update && apt-get -y upgrade
ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime
# # needed for concoct
export DEBIAN_FRONTEND=noninteractive
apt-get -y install \
build-essential \
wget \
git \
less \
rsync \
curl libcurl4 \
python3 python3-dev python3-pybedtools
cd /
wget -c https://repo.anaconda.com/miniconda/Miniconda3-py39_4.11.0-Linux-x86_64.sh
/bin/bash Miniconda3-py39_4.11.0-Linux-x86_64.sh -bfp /miniconda3
export PATH=/miniconda3/bin:$PATH
conda config --file /miniconda3/.condarc --add channels defaults
conda config --file /miniconda3/.condarc --add channels conda-forge
conda config --file /miniconda3/.condarc --add channels bioconda
conda config --file /miniconda3/.condarc --add channels ursky
echo ". /miniconda3/etc/profile.d/conda.sh" >> $SINGULARITY_ENVIRONMENT
echo "conda activate scenicplus" >> $SINGULARITY_ENVIRONMENT
. /miniconda3/etc/profile.d/conda.sh
conda create --name scenicplus python=3.8
conda activate scenicplus
cd /
mkdir /ucsc.v386
cd /ucsc.v386
wget -O bedToBigBed http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedToBigBed
chmod a+x /ucsc.v386/*
cd /
wget https://github.com/macs3-project/MACS/archive/refs/tags/v2.2.7.1.tar.gz -O MACS.tar.gz
tar -xvf MACS.tar.gz
cd MACS-2.2.7.1
sed -i 's/install_requires = \[f"numpy>={numpy_requires}",\]/install_requires = \[f"numpy{numpy_requires}",\]/' setup.py
pip install -e .
conda install --channel conda-forge --channel bioconda bedtools htslib pyrle pybedtools scanpy python-igraph leidenalg
cd /
git clone https://github.com/aertslab/scenicplus
cd scenicplus
# patch https://github.com/aertslab/scenicplus/commit/821ee7b719afbd1d1e74aadb3ffda9e27165c930
sed -i 's/get_feature_names/get_feature_names_out/' /scenicplus/src/scenicplus/loom.py
pip install -e .
conda install --channel conda-forge numpy=1.23.5 --force
pip install louvain
Hope this helps!
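One caveat about the sed patch in the recipe: if it is run against a checkout that already contains the fix, it would turn get_feature_names_out into get_feature_names_out_out. A minimal Python sketch of the same substitution that is safe to apply more than once is shown here. The function name patch_source is hypothetical, and the sample line is the one from the traceback earlier in this thread.

```python
import re

# Match the old method name only when it is NOT already followed by "_out",
# so text that has been patched before is left untouched.
OLD_NAME = re.compile(r"get_feature_names(?!_out)")


def patch_source(text):
    """Rename get_feature_names to get_feature_names_out, idempotently."""
    return OLD_NAME.sub("get_feature_names_out", text)


line = "), columns=cv.get_feature_names(), index=regulons.keys())"
patched = patch_source(line)
print(patched)
# Applying the patch a second time changes nothing.
print(patch_source(patched) == patched)
```

To patch the file itself, the same function could be applied to the contents of src/scenicplus/loom.py and the result written back.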
> Hi both
> You're right. get_feature_names got replaced by get_feature_names_out (see: scikit-learn/scikit-learn#18444). I will update the code.
> Best,
> Seppe

@SeppeDeWinter Hi, does this mean we just have to update scikit-learn?
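As a note on the scikit-learn question: get_feature_names was deprecated in scikit-learn 1.0 and removed in 1.2, so updating scikit-learn will not make the old call work again. The call in scenicplus has to change (as in the patch above), or scikit-learn has to stay below 1.2. A minimal sketch for checking which side of that boundary an environment is on is shown here; both helper names are hypothetical.

```python
import re
from importlib import metadata


def lacks_get_feature_names(version):
    """True if this scikit-learn version string is 1.2 or newer."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # get_feature_names() was removed in the 1.2 release.
    return (major, minor) >= (1, 2)


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if it is absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


version = installed_sklearn_version()
if version is None:
    print("scikit-learn is not installed in this environment")
elif lacks_get_feature_names(version):
    print(f"scikit-learn {version}: only get_feature_names_out() is available")
else:
    print(f"scikit-learn {version}: get_feature_names() still exists")
```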
Hi! I was running the scenicplus PBMC 3K tutorial using the Singularity container. When I ran the following code at the indicated step:
I got this error:
Do you know why and could you help me fix it? Thank you! Li