Closed: @Cyaneiss closed this issue 4 weeks ago.
Hi @Cyaneiss,
Could you share the output of the following command?
$ conda list
Also, regarding this error:
AttributeError: module 'fuc.api.common' has no attribute '_script_name'
Please see #60.
Steven
Hello @sbslee, thanks for the quick answer.
First of all, here's the list of packages in my (base) env, where I installed PyPGx by git cloning. I essentially use this env as my sandbox, so there are quite a few.
# packages in environment at /home/bioinfo-bioch/miniconda3:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 2.1.0 pypi_0 pypi
aldy 4.6 pypi_0 pypi
anaconda-anon-usage 0.4.4 py312hfc0e8ea_100
annotated-types 0.7.0 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
archspec 0.2.3 pyhd3eb1b0_0
argparse-dataclass 2.0.0 pypi_0 pypi
attrs 24.2.0 pypi_0 pypi
biopython 1.84 pypi_0 pypi
boltons 23.0.0 py312h06a4308_0
brotli-python 1.0.9 py312h6a678d5_8
bzip2 1.0.8 h5eee18b_6
c-ares 1.19.1 h5eee18b_0
ca-certificates 2024.7.2 h06a4308_0
certifi 2024.7.4 py312h06a4308_0
cffi 1.16.0 py312h5eee18b_1
charset-normalizer 3.3.2 pyhd3eb1b0_0
click 8.1.7 pypi_0 pypi
coloredlogs 15.0.1 pypi_0 pypi
colormath 3.0.0 pypi_0 pypi
conda 24.7.1 py312h06a4308_0
conda-content-trust 0.2.0 py312h06a4308_1
conda-inject 1.3.2 pypi_0 pypi
conda-libmamba-solver 24.7.0 pyhd3eb1b0_0
conda-package-handling 2.3.0 py312h06a4308_0
conda-package-streaming 0.10.0 py312h06a4308_0
configargparse 1.7 pypi_0 pypi
connection-pool 0.0.3 pypi_0 pypi
contourpy 1.3.0 pypi_0 pypi
cryptography 42.0.5 py312hdda0065_1
cycler 0.12.1 pypi_0 pypi
cython 3.0.11 pypi_0 pypi
datrie 0.8.2 pypi_0 pypi
distro 1.9.0 py312h06a4308_0
docutils 0.21.2 pypi_0 pypi
dpath 2.2.0 pypi_0 pypi
expat 2.6.2 h6a678d5_0
fastjsonschema 2.20.0 pypi_0 pypi
fmt 9.1.0 hdb19cb5_1
fonttools 4.54.1 pypi_0 pypi
frozendict 2.4.2 py312h06a4308_0
fuc 0.38.0 pypi_0 pypi
gitdb 4.0.11 pypi_0 pypi
gitpython 3.1.43 pypi_0 pypi
humanfriendly 10.0 pypi_0 pypi
humanize 4.10.0 pypi_0 pypi
icu 73.1 h6a678d5_0
idna 3.7 py312h06a4308_0
immutabledict 4.2.0 pypi_0 pypi
immutables 0.21 pypi_0 pypi
importlib-metadata 8.5.0 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
inotify-simple 1.3.5 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
jsonpatch 1.33 py312h06a4308_1
jsonpointer 2.1 pyhd3eb1b0_0
jsonschema 4.23.0 pypi_0 pypi
jsonschema-specifications 2024.10.1 pypi_0 pypi
jupyter-core 5.7.2 pypi_0 pypi
kaleido 0.2.1 pypi_0 pypi
kiwisolver 1.4.7 pypi_0 pypi
kmerexplor 1.1.0 pypi_0 pypi
krb5 1.20.1 h143b758_1
ld_impl_linux-64 2.38 h1181459_1
libarchive 3.6.2 hfab0078_4
libcurl 8.7.1 h251f7ec_0
libedit 3.1.20230828 h5eee18b_0
libev 4.33 h7f8727e_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libmamba 1.5.8 hfe524e5_2
libmambapy 1.5.8 py312h2dafd23_2
libnghttp2 1.57.0 h2d74bed_0
libsolv 0.7.24 he621ea3_1
libssh2 1.11.0 h251f7ec_0
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
libxml2 2.13.1 hfdd30dd_2
logbook 1.7.0.post0 pypi_0 pypi
lxml 5.3.0 pypi_0 pypi
lz4-c 1.9.4 h6a678d5_1
mappy 2.28 pypi_0 pypi
markdown 3.7 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.9.2 pypi_0 pypi
matplotlib-venn 1.1.1 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
menuinst 2.1.2 py312h06a4308_0
multiqc 1.25 pypi_0 pypi
natsort 8.4.0 pypi_0 pypi
nbformat 5.10.4 pypi_0 pypi
ncls 0.0.68 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.3 pypi_0 pypi
numpy 2.1.1 pypi_0 pypi
openssl 3.0.14 h5eee18b_0
ortools 9.11.4210 pypi_0 pypi
packaging 24.1 py312h06a4308_0
pandas 2.2.3 pypi_0 pypi
patsy 0.5.6 pypi_0 pypi
pcre2 10.42 hebb0a14_1
pillow 10.4.0 pypi_0 pypi
pip 24.2 py312h06a4308_0
plac 1.4.3 pypi_0 pypi
platformdirs 3.10.0 py312h06a4308_0
plotly 5.24.1 pypi_0 pypi
pluggy 1.5.0 pypi_0 pypi
protobuf 5.26.1 pypi_0 pypi
psutil 6.0.0 pypi_0 pypi
pulp 2.9.0 pypi_0 pypi
pybind11-abi 5 hd3eb1b0_0
pycosat 0.6.6 py312h5eee18b_1
pycparser 2.21 pyhd3eb1b0_0
pydantic 2.9.2 pypi_0 pypi
pydantic-core 2.23.4 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.4 pypi_0 pypi
pypgx 0.25.0 pypi_0 pypi
pyranges 0.1.2 pypi_0 pypi
pysam 0.22.1 pypi_0 pypi
pysocks 1.7.1 py312h06a4308_0
pytest 8.3.3 pypi_0 pypi
python 3.12.4 h5148396_1
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2024.2 pypi_0 pypi
pyyaml 6.0.2 pypi_0 pypi
readline 8.2 h5eee18b_0
referencing 0.35.1 pypi_0 pypi
reproc 14.2.4 h6a678d5_2
reproc-cpp 14.2.4 h6a678d5_2
requests 2.32.3 py312h06a4308_0
reretry 0.11.8 pypi_0 pypi
rich 13.8.1 pypi_0 pypi
rich-click 1.8.3 pypi_0 pypi
rpds-py 0.20.0 pypi_0 pypi
ruamel.yaml 0.17.21 py312h5eee18b_0
scikit-learn 1.5.2 pypi_0 pypi
scipy 1.14.1 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
setuptools 72.1.0 py312h06a4308_0
six 1.16.0 pypi_0 pypi
smart-open 7.0.5 pypi_0 pypi
smmap 5.0.1 pypi_0 pypi
snakemake 8.23.0 pypi_0 pypi
snakemake-interface-common 1.17.4 pypi_0 pypi
snakemake-interface-executor-plugins 9.3.2 pypi_0 pypi
snakemake-interface-report-plugins 1.1.0 pypi_0 pypi
snakemake-interface-storage-plugins 3.3.0 pypi_0 pypi
sorted-nearest 0.0.39 pypi_0 pypi
spectra 0.0.11 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
statsmodels 0.14.4 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tenacity 9.0.0 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
throttler 1.2.2 pypi_0 pypi
tk 8.6.14 h39e8969_0
tqdm 4.66.4 py312he106c6f_0
traitlets 5.14.3 pypi_0 pypi
truststore 0.8.0 py312h06a4308_0
typeguard 4.3.0 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024.2 pypi_0 pypi
urllib3 2.2.2 py312h06a4308_0
wheel 0.43.0 py312h06a4308_0
wrapt 1.16.0 pypi_0 pypi
xz 5.4.6 h5eee18b_1
yaml-cpp 0.8.0 h6a678d5_1
yte 1.5.4 pypi_0 pypi
zipp 3.20.2 pypi_0 pypi
zlib 1.2.13 h5eee18b_1
zstandard 0.22.0 py312h2c38b39_0
zstd 1.5.5 hc292b87_2
And secondly, I forgot to include the error I get during the GeT-RM tutorial, my bad! Here it is:
(base) user:~/Documents/JCB/pypgx/getrm-wgs-tutorial$ pypgx run-ngs-pipeline \
CYP2D6 \
grch37-CYP2D6-pipeline \
--variants grch37-variants.vcf.gz \
--depth-of-coverage grch37-depth-of-coverage.zip \
--control-statistics grch37-control-statistics-VDR.zip
Saved VcfFrame[Imported] to: grch37-CYP2D6-pipeline/imported-variants.zip
Saved VcfFrame[Phased] to: grch37-CYP2D6-pipeline/phased-variants.zip
Saved VcfFrame[Consolidated] to: grch37-CYP2D6-pipeline/consolidated-variants.zip
Saved SampleTable[Alleles] to: grch37-CYP2D6-pipeline/alleles.zip
Saved CovFrame[ReadDepth] to: grch37-CYP2D6-pipeline/read-depth.zip
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/utils.py:456: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '0 2.0000
1 2.0625
2 2.0625
3 2.0625
4 2.0625
...
39379 0.5625
39380 0.5625
39381 0.5625
39382 0.5625
39383 0.5625
Name: NA19143_PyPGx, Length: 39384, dtype: float64' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
df.iloc[:, 2:] = df.iloc[:, 2:] / medians * 2
Saved CovFrame[CopyNumber] to: grch37-CYP2D6-pipeline/copy-number.zip
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/sklearn/base.py:376: InconsistentVersionWarning: Trying to unpickle estimator SVC from version 0.24.2 when using version 1.5.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/sklearn/base.py:376: InconsistentVersionWarning: Trying to unpickle estimator LabelBinarizer from version 0.24.2 when using version 1.5.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/sklearn/base.py:376: InconsistentVersionWarning: Trying to unpickle estimator OneVsRestClassifier from version 0.24.2 when using version 1.5.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/utils.py:151: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
df = df.fillna(method='ffill')
/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/utils.py:152: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
df = df.fillna(method='bfill')
Saved SampleTable[CNVCalls] to: grch37-CYP2D6-pipeline/cnv-calls.zip
Saved SampleTable[Genotypes] to: grch37-CYP2D6-pipeline/genotypes.zip
Saved SampleTable[Phenotypes] to: grch37-CYP2D6-pipeline/phenotypes.zip
Saved SampleTable[Results] to: grch37-CYP2D6-pipeline/results.zip
The pandas warning block repeats itself a lot of times, so I cut the repeats and only pasted one occurrence.
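For what it's worth, I assume the fillna-related FutureWarnings just flag the deprecated fillna(method=...) API; a minimal equivalent with the newer pandas calls (my own sketch, not the pypgx source) would presumably be:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Position': [1, 2, 3, 4], 'NA19143_PyPGx': [np.nan, 2.0, np.nan, 0.5]})
df = df.ffill()  # replaces df.fillna(method='ffill')
df = df.bfill()  # replaces df.fillna(method='bfill')
print(df)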
I will come back to you about trying PyPGx with a conda install once I have tested it further.
Best regards, Peter
@Cyaneiss,
Please note that the "error" messages you received from the GeT-RM tutorial are actually warnings. They are fine. The fact that you obtained the SampleTable[Results] file means everything worked well.
You can follow this section in the tutorial to make sure the results you got are accurate:
$ wget https://raw.githubusercontent.com/sbslee/pypgx-data/main/getrm-wgs-tutorial/grch37-CYP2D6-results.zip
$ pypgx compare-genotypes grch37-CYP2D6-pipeline/results.zip grch37-CYP2D6-results.zip
# Genotype
Total: 70
Compared: 70
Concordance: 1.000 (70/70)
# CNV
Total: 70
Compared: 70
Concordance: 1.000 (70/70)
Thank you for the clarification. I thought these warnings were part of what caused my initial error. I get 100% concordance after using compare-genotypes, as you said, so the problem isn't caused by the installation itself. Do you maybe have an idea of what causes the error?
Concerning my attempts to install and run PyPGx with conda, I tried the solution you linked to (both with v0.15.0 and v0.25.0 of PyPGx), but got the following error:
(base) user:~/Documents/JCB/pypgx/bamSophia$ conda create -n pypgx -c bioconda conda-forge pypgx=0.15.0 fuc=0.33.1
Channels:
- bioconda
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed
LibMambaUnsatisfiableError: Encountered problems while solving:
- nothing provides matplotlib-venn needed by fuc-0.33.1-pyh5e36f6f_0
- nothing provides requested conda-forge
Could not solve for environment specs
The following packages are incompatible
├─ conda-forge does not exist (perhaps a typo or a missing channel);
└─ fuc 0.33.1** is not installable because it requires
└─ matplotlib-venn, which does not exist (perhaps a missing channel).
Including conda-forge as a channel (i.e., adding a second -c flag: -c conda-forge) seemed to solve this error, but later, when trying to run run-ngs-pipeline on CYP2D6 (I also tried TPMT just in case), I got another error with matplotlib:
(pypgx) user:~/Documents/JCB/pypgx/bamSophia$ pypgx run-ngs-pipeline CYP2D6 grch37-CYP2D6-pipeline --variants grch37-variants.vcf.gz --depth-of-coverage grch37-depth-of-coverage.zip --control-statistics grch37-control-statistics-VDR.zip
Traceback (most recent call last):
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/bin/pypgx", line 6, in <module>
from pypgx.__main__ import main
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/pypgx/__init__.py", line 1, in <module>
from .api.core import (
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/pypgx/api/core.py", line 10, in <module>
from .. import sdk
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/pypgx/sdk/__init__.py", line 1, in <module>
from .utils import (Archive, add_cn_samples, compare_metadata, simulate_copy_number)
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/pypgx/sdk/utils.py", line 10, in <module>
from fuc import pyvcf, pycov, common, pybam
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/fuc/__init__.py", line 1, in <module>
from .api import *
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/fuc/api/pyvcf.py", line 145, in <module>
from . import pybed, common, pymaf, pybam
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/fuc/api/pybed.py", line 43, in <module>
from . import common
File "/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/fuc/api/common.py", line 25, in <module>
from matplotlib.collections import BrokenBarHCollection
ImportError: cannot import name 'BrokenBarHCollection' from 'matplotlib.collections' (/home/bioinfo-bioch/miniconda3/envs/pypgx/lib/python3.12/site-packages/matplotlib/collections.py)
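From a quick look, this seems to come from newer matplotlib releases having dropped BrokenBarHCollection, which this fuc version still imports at module load time. A minimal check along these lines (my own snippet, nothing from fuc) shows whether the installed matplotlib still provides it:

import matplotlib
print(matplotlib.__version__)
try:
    # fuc/api/common.py imports this name when the module is loaded
    from matplotlib.collections import BrokenBarHCollection
    print('BrokenBarHCollection is available')
except ImportError:
    print('BrokenBarHCollection is gone; this fuc version needs an older matplotlib')

If that is the case, pinning an older matplotlib in the pypgx env would presumably avoid this import error.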
Best regards, Peter
I did some more checking on the error I got initially. The error message first points to the following file:
File "/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/pipeline.py", line 293, in run_ngs_pipeline
cnv_calls = utils.predict_cnv(copy_number, cnv_caller=cnv_caller)
There, I checked the values of copy_number (got False) and cnv_caller (got None). Next I checked this file:
File "/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/utils.py", line 1242, in predict_cnv
copy_number = _process_copy_number(copy_number)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfo-bioch/miniconda3/lib/python3.12/site-packages/pypgx/api/utils.py", line 157, in _process_copy_number
raise ValueError('Missing values detected')
The error comes from this function:
def _process_copy_number(copy_number):
    df = copy_number.data.copy_df()
    region = core.get_region(copy_number.metadata['Gene'], assembly=copy_number.metadata['Assembly'])
    chrom, start, end = common.parse_region(region)

    if (end - start + 1) > copy_number.data.shape[0]:
        temp = pd.DataFrame.from_dict({'Temp': range(int(df.Position.iat[0]-1), int(df.Position.iat[-1])+1)})
        temp = temp.merge(df, left_on='Temp', right_on='Position', how='outer')
        df = temp.drop(columns='Temp')
        df = df.fillna(method='ffill')
        df = df.fillna(method='bfill')

    df.iloc[:, 2:] = df.iloc[:, 2:].apply(lambda c: median_filter(c, size=1000), axis=0)

    if df.isnull().values.any():
        raise ValueError('Missing values detected')

    return sdk.Archive(copy_number.copy_metadata(), pycov.CovFrame(df))
where df contains only NaN values.
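As a sanity check (my own illustration, not pypgx code), forward/backward filling cannot recover a column that is entirely NaN, which is exactly what trips the ValueError above:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Position': [1, 2, 3], 'Sample1': [np.nan, np.nan, np.nan]})
df = df.ffill().bfill()              # an all-NaN column stays all-NaN
print(df.isnull().values.any())      # True -> ValueError('Missing values detected')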
I'm not sure why; maybe it has something to do with the resource bundle, but I installed it the way the documentation shows. I also added it to my PATH, just in case.
I hope this helps.
I recreated every single file I needed to run run-ngs-pipeline. This time I used a .txt file listing every .bam file I want to use. I also set the ID and SM tags of the .bam files to the whole file name (including the extension), not just the sample name. This way, the tool works just fine even on CYP2D6 and other genes that have SVs.
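For reference, this is roughly how I now double-check the SM tags before running the pipeline (my own snippet; bam-list.txt is just my text file with one BAM path per line):

import pysam

with open('bam-list.txt') as f:
    bam_paths = [line.strip() for line in f if line.strip()]

for path in bam_paths:
    with pysam.AlignmentFile(path) as bam:
        # SM is the read-group field I now set to the whole file name (including the extension).
        sm_tags = {rg.get('SM') for rg in bam.header.to_dict().get('RG', [])}
        print(path, sm_tags)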
I still don't understand the error I got when creating this issue, but at least I have a workaround that works.
Thank you for your help and your time, @sbslee.
@Cyaneiss Thanks for the update! I'm glad you found the solution on your own. Please feel free to open another Issue if you encounter any problems.
Hello,
I'm trying to use the run-ngs-pipeline tool on different genes. The genes without SVs work just fine, but I get this error when trying to run it on CYP2D6.
I'm on Ubuntu 24.04.1 LTS. I installed PyPGx using git clone. The two zip files were generated with the appropriate tools included in PyPGx, with the right assembly. The .vcf.gz file was also generated using PyPGx, with GRCh28 set as the assembly.
Here is what happens when running run-ngs-pipeline on TPMT; this seems like the expected behaviour (but maybe it's not?):
And here is the error message when running it on CYP2D6:
From what I see, my scikit-learn version may be a source of trouble (I have 1.5.2). Do I have to downgrade it to 0.24.2? When I tried downgrading scikit-learn (and Python) in another conda env and running the same command, I got the following error message:
If I can provide any further information, I will gladly do so.
Thank you in advance for your time and help. Best regards, Peter