Closed rpolicastro closed 1 year ago
Thanks @rpolicastro!
@DongzeHE, seems like it’s an issue with pyranges handling the gtf. Any thoughts?
@rpolicastro,
Could you also list your pyroe version?
Thanks! Rob
0.9.0, so it should be the latest release version as of today.
Cheers!
As a semi-related note, sometimes ENSEMBL GTFs have malformed entries that cause problems with e.g. the Salmon workflow. For those cases I'll run them through AGAT first to fix errors.
I tried doing the same for this GTF file but the error was the same as the one above. This was run with agat v1.0.0.
agat_convert_sp_gff2gtf.pl \
-i Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz \
--gtf_version 2.5 \
-o cleaned_Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz
# This version of AGAT didn't seem to actually zip the file, so I manually did so.
mv cleaned_Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz cleaned_Saccharomyces_cerevisiae.R64-1-1.109.gtf
gzip cleaned_Saccharomyces_cerevisiae.R64-1-1.109.gtf
Hi @rpolicastro,
Ok, I'm trying to reproduce with the following, but so far it seemed to work:
$ conda install pyroe
$ pyroe -v
0.9.0
$ wget "https://ftp.ensembl.org/pub/release-109/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa.gz"
$ wget "https://ftp.ensembl.org/pub/release-109/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz"
$ gunzip Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa.gz
$ mkdir OUTDIR
$ pyroe make-splici Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz 90 OUTDIR
Note that the genome needs to be decompressed for pyroe. Also, I noticed some weird behavior with the alias make-spliced+intronic that we should fix upstream (but simpleaf will be using the make-splici alias anyway). Can you confirm that you get the same problem with the above commands?
Thanks! Rob
This is strange: I get the same error after using a fresh install of pyroe 0.9.0 and decompressing the assembly.
The mamba environment.
mamba create -n pyroe -c conda-forge -c bioconda pyroe==0.9.0
mamba activate pyroe
Running pyroe directly.
wget "https://ftp.ensembl.org/pub/release-109/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz"
wget "https://ftp.ensembl.org/pub/release-109/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa.gz"
gunzip Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa.gz
mkdir -p OUTDIR
pyroe make-splici Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa Saccharomyces_cerevisiae.R64-1-1.109.gtf.gz 90 OUTDIR
The error (same as last time).
Traceback (most recent call last):
  File "/software/miniconda3/envs/pyroe/bin/pyroe", line 254, in <module>
    make_splici_txome(
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pyroe/make_txome.py", line 638, in make_splici_txome
    introns = gr.features.introns(by="transcript")
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pyranges/genomicfeatures.py", line 254, in introns
    result = pyrange_apply(_introns2, by_gr, exons, **kwargs)
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pyranges/multithreaded.py", line 293, in pyrange_apply
    result = call_f(function, nparams, df, odf, kwargs)
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pyranges/multithreaded.py", line 23, in call_f
    return f.remote(df, odf, **kwargs)
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pyranges/genomicfeatures.py", line 607, in _introns2
    introns.Feature.cat.add_categories(["intron"], inplace=True)
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pandas/core/accessor.py", line 112, in f
    return self._delegate_method(name, *args, **kwargs)
  File "/software/miniconda3/envs/pyroe/lib/python3.10/site-packages/pandas/core/arrays/categorical.py", line 2475, in _delegate_method
    res = method(*args, **kwargs)
TypeError: Categorical.add_categories() got an unexpected keyword argument 'inplace'
Sighhhh.... that's gonna make things tough. At this point I'm guessing that maybe the issue has to do with the version of pyranges being pulled in — that is the library being used for GTF parsing and has been the source of issues in the past. This is what I get:
❯ conda list | rg "pyranges"
pyranges 0.0.120 pyh7cba7a3_0 bioconda
Hi @rpolicastro,
Ok, I was able to reproduce this. Right now the key differences seem to be that in the env that reproduces it I am using OSX (rather than linux) and the base install is python 3.10 rather than 3.9.x. I'm thinking the latter one is to blame. @DongzeHE — we should figure out what the problem is upstream here, as we definitely need python 3.10 (and probably 3.11) support. Sigh ...
However, I'll note that before the traceback, I get this message:
WARNING:root: Found records with missing gene_id/gene_name field. These records are reported in OUTDIR/missing_gene_id_or_name_records.gtf. Imputed 10504 missing gene_name using gene_id.
WARNING:root: A clean GTF file with all issues fixed is generated at OUTDIR/clean_gtf.gtf. If needed, please rerun using this clean GTF file.
If I follow the suggestion and then run:
pyroe make-splici Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa OUTDIR/clean_gtf.gtf 90 OUTDIR
execution completes successfully. Is the same true for you?
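For anyone curious what the "Imputed ... missing gene_name using gene_id" step in the warning amounts to, here is a rough sketch of the idea. It is not pyroe's actual code; the function name `impute_gene_name` and the example attribute string are made up for illustration, and it only looks at the attribute column of a single GTF record.

```python
import re


def impute_gene_name(attributes: str) -> str:
    """Return a GTF attribute string with gene_name copied from gene_id if absent.

    Hypothetical helper for illustration only; pyroe's real logic may differ.
    """
    # Leave the record alone if it already carries a gene_name.
    if re.search(r'\bgene_name\s+"', attributes):
        return attributes
    match = re.search(r'\bgene_id\s+"([^"]+)"', attributes)
    if match is None:
        # No gene_id either, so there is nothing to impute from.
        return attributes
    base = attributes.rstrip()
    if not base.endswith(";"):
        base += ";"
    return f'{base} gene_name "{match.group(1)}";'


# Illustrative attribute column from a yeast-style record.
record = 'gene_id "YDL246C"; gene_biotype "protein_coding";'
print(impute_gene_name(record))
```

Records that already have a gene_name pass through unchanged, so applying the function to every line of a GTF would be idempotent.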
--Rob
Alright, so decompressing my custom genome assembly resolved the original issue I had; the one that prompted this issue and reprex. In hindsight the error message (regarding invalid UTF-8 characters) made sense, since it was trying to read the archive as plain text. It might be worth mentioning in the simpleaf index --help text that the assembly should be decompressed, and perhaps explicitly checking for a compressed file format so there can be a more graceful error.
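As a sketch of what such a check could look like (written in Python for brevity; simpleaf itself is not Python, and the function names here are hypothetical): gzip files start with the two magic bytes 0x1f 0x8b, so peeking at the first two bytes is enough to give a clearer error than a UTF-8 decoding failure.

```python
from pathlib import Path

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path) -> bool:
    """Return True if the file at `path` begins with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def require_plain_fasta(path) -> Path:
    """Return `path` as a Path, or raise a readable error if it is gzip-compressed."""
    path = Path(path)
    if is_gzipped(path):
        raise ValueError(
            f"{path} looks gzip-compressed; please decompress it "
            "(e.g. with gunzip) before indexing."
        )
    return path
```

Checking the content rather than the `.gz` extension also catches compressed files that have been renamed.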
Now, back to the strangeness of this error. I ran the same workflow for the yeast genome that worked for my custom assembly and got the same error.
As I was typing this I saw the message that you were able to reproduce this. Let me try both rerunning with that cleaned GTF file, and also checking whether downgrading python to 3.9 (from the current 3.10) works.
A few follow-up notes:
clean_gtf.gtf resulted in the indexing step working for the yeast genome.
Ok, so we are making some progress. Some notes:
(1) Aside from pyroe, we should explicitly check in simpleaf if the genome is compressed and, if so, simply decompress it before passing it to pyroe (we can clean up the decompressed version after successful extraction).
(2) It's good that clean_gtf.gtf works; that means the issue is definitely related to pyranges' parsing of the original GTF file, though erroring out with a backtrace is not optimal behavior ;P.
(3) Very interesting. I wonder what's different about my environment where it just works? We'll have to investigate that further. It seems it's python 3.9.7, but I really doubt that difference is the key.
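To make point (1) concrete, here is a minimal sketch of the "decompress first, clean up afterwards" idea. It is in Python only for illustration and is not how simpleaf implements anything; `decompress_if_gzipped` and its return convention are assumptions.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def decompress_if_gzipped(path, workdir=None):
    """Return (usable_path, is_temporary).

    If `path` is gzip-compressed, write a decompressed copy to a temporary
    file and return that; otherwise return the original path untouched.
    Hypothetical helper, not part of simpleaf or pyroe.
    """
    path = Path(path)
    with open(path, "rb") as handle:
        if handle.read(2) != b"\x1f\x8b":  # gzip magic bytes
            return path, False
    # delete=False keeps the file on disk after the handle is closed.
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".fa", dir=workdir)
    with gzip.open(path, "rb") as src, tmp:
        shutil.copyfileobj(src, tmp)
    return Path(tmp.name), True
```

The caller would pass the returned path on to pyroe and, if the second value is True, remove the temporary file (for example with `Path.unlink`) once extraction has succeeded.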
Here's one thought / suggestion — do you think a pip install would be any different?
Adding to the strangeness
In case you wanted to compare versions, here's everything in my python 3.9.7 environment created via mamba create -n simpleaf -c conda-forge -c bioconda pyroe==0.9.0 simpleaf==0.12.0 piscem==0.6.0 python==3.9.7.
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
alevin-fry 0.8.1 h9f5acd7_0 bioconda
anndata 0.9.1 pyhd8ed1ab_0 conda-forge
bedtools 2.30.0 h468198e_3 bioconda
biopython 1.81 py39h72bdee0_0 conda-forge
boost-cpp 1.74.0 h6cacc03_7 conda-forge
brotli 1.0.9 h166bdaf_8 conda-forge
brotli-bin 1.0.9 h166bdaf_8 conda-forge
brotlipy 0.7.0 py39hb9d737c_1005 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.12.7 ha878542_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
certifi 2022.12.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py39he91dace_3 conda-forge
charset-normalizer 3.1.0 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
contourpy 1.0.7 py39h4b4f3f3_0 conda-forge
cryptography 40.0.2 py39h079d5ae_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
fonttools 4.39.3 py39h72bdee0_0 conda-forge
freetype 2.12.1 hca18f0e_1 conda-forge
h5py 3.8.0 nompi_py39h89bf01e_101 conda-forge
hdf5 1.14.0 nompi_hb72d44e_103 conda-forge
icu 69.1 h9c3ff4c_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
importlib-metadata 6.4.1 pyha770c72_0 conda-forge
importlib-resources 5.12.0 pyhd8ed1ab_0 conda-forge
importlib_metadata 6.4.1 hd8ed1ab_0 conda-forge
importlib_resources 5.12.0 pyhd8ed1ab_0 conda-forge
joblib 1.2.0 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.4 py39hf939315_1 conda-forge
krb5 1.20.1 h81ceb04_0 conda-forge
lcms2 2.15 haa2dc70_1 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libaec 1.0.6 hcb278e6_1 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h166bdaf_8 conda-forge
libbrotlidec 1.0.9 h166bdaf_8 conda-forge
libbrotlienc 1.0.9 h166bdaf_8 conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 8.0.1 h588be90_0 conda-forge
libdeflate 1.18 h0b41bf4_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libhwloc 2.8.0 h32351e8_1 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
libjemalloc 5.3.0 hcb278e6_0 conda-forge
libjpeg-turbo 2.1.5.1 h0b41bf4_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libllvm11 11.1.0 he0ac6c6_5 conda-forge
libnghttp2 1.52.0 h61bc06f_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 hf14f497_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libtiff 4.5.0 ha587672_6 conda-forge
libwebp-base 1.3.0 h0b41bf4_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.14 haae042b_4 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
llvmlite 0.39.1 py39h7d9a04d_1 conda-forge
matplotlib-base 3.7.1 py39he190548_0 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
natsort 8.3.1 pyhd8ed1ab_0 conda-forge
ncls 0.0.66 py39hbf8eff0_0 bioconda
ncurses 6.3 h27087fc_1 conda-forge
networkx 3.1 pyhd8ed1ab_0 conda-forge
numba 0.56.4 py39h71a7301_1 conda-forge
numpy 1.23.5 py39h3d75532_0 conda-forge
openjpeg 2.5.0 hfec8fc6_2 conda-forge
openssl 3.1.0 h0b41bf4_0 conda-forge
packaging 23.1 pyhd8ed1ab_0 conda-forge
pandas 2.0.0 py39h2ad29b5_0 conda-forge
patsy 0.5.3 pyhd8ed1ab_0 conda-forge
pillow 9.5.0 py39h7207d5c_0 conda-forge
pip 23.1 pyhd8ed1ab_0 conda-forge
piscem 0.6.0 h52b76fa_0 bioconda
platformdirs 3.2.0 pyhd8ed1ab_0 conda-forge
pooch 1.7.0 pyha770c72_3 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pynndescent 0.5.8 pyh1a96a4e_0 conda-forge
pyopenssl 23.1.1 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pyranges 0.0.120 pyh7cba7a3_0 bioconda
pyrle 0.0.35 py39hbf8eff0_1 bioconda
pyroe 0.9.0 pyhdfd78af_0 bioconda
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.9.7 hf930737_3_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-tzdata 2023.3 pyhd8ed1ab_0 conda-forge
python_abi 3.9 3_cp39 conda-forge
pytz 2023.3 pyhd8ed1ab_0 conda-forge
readline 8.2 h8228510_1 conda-forge
requests 2.28.2 pyhd8ed1ab_1 conda-forge
salmon 1.10.1 h7e5ed60_0 bioconda
scanpy 1.9.3 pyhd8ed1ab_0 conda-forge
scikit-learn 1.2.2 py39hd189fd4_1 conda-forge
scipy 1.10.1 py39h7360e5f_0 conda-forge
seaborn 0.12.2 hd8ed1ab_0 conda-forge
seaborn-base 0.12.2 pyhd8ed1ab_0 conda-forge
session-info 1.0.0 pyhd8ed1ab_0 conda-forge
setuptools 67.6.1 pyhd8ed1ab_0 conda-forge
simpleaf 0.12.0 h9f5acd7_0 bioconda
six 1.16.0 pyh6c4a22f_0 conda-forge
sorted_nearest 0.0.37 py39hbf8eff0_0 bioconda
sqlite 3.40.0 h4ff8645_0 conda-forge
statsmodels 0.13.5 py39h2ae25f5_2 conda-forge
stdlib-list 0.8.0 pyhd8ed1ab_0 conda-forge
tabulate 0.9.0 pyhd8ed1ab_1 conda-forge
tbb 2021.7.0 h924138e_1 conda-forge
threadpoolctl 3.1.0 pyh8a188c0_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tqdm 4.65.0 pyhd8ed1ab_1 conda-forge
typing-extensions 4.5.0 hd8ed1ab_0 conda-forge
typing_extensions 4.5.0 pyha770c72_0 conda-forge
tzdata 2023c h71feb2d_0 conda-forge
umap-learn 0.5.3 py39hf3d152e_0 conda-forge
unicodedata2 15.0.0 py39hb9d737c_0 conda-forge
urllib3 1.26.15 pyhd8ed1ab_0 conda-forge
wheel 0.40.0 pyhd8ed1ab_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zipp 3.15.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
zstd 1.5.2 h3eb15da_6 conda-forge
I am now very curious about the local version that works without the clean GTF! I am also curious what it is about the Ensembl GTF that breaks pyranges!
You probably hit on some holy combination of versions to counter the unholy mess that is ENSEMBL GTFs 😅
So here is what I have in my pip3 (python 3.9.7) install (investigated via pipdeptree):
pyroe==0.9.0
- biopython [required: >=1.77, installed: 1.79]
- numpy [required: Any, installed: 1.21.5]
- packaging [required: >=21.0, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pandas [required: >=1.3.0, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- pyranges [required: >=0.0.120, installed: 0.0.120]
- cython [required: Any, installed: 0.29.28]
- natsort [required: Any, installed: 8.1.0]
- ncls [required: >=0.0.63, installed: 0.0.64]
- numpy [required: Any, installed: 1.21.5]
- pandas [required: Any, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- pyrle [required: Any, installed: 0.0.34]
- cython [required: Any, installed: 0.29.28]
- natsort [required: Any, installed: 8.1.0]
- numpy [required: Any, installed: 1.21.5]
- pandas [required: Any, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- tabulate [required: Any, installed: 0.8.9]
- sorted-nearest [required: >=0.0.33, installed: 0.0.33]
- cython [required: Any, installed: 0.29.28]
- numpy [required: Any, installed: 1.21.5]
- tabulate [required: Any, installed: 0.8.9]
- scanpy [required: >=1.8.2, installed: 1.8.2]
- anndata [required: >=0.7.4, installed: 0.8.0]
- h5py [required: >=3, installed: 3.6.0]
- numpy [required: >=1.14.5, installed: 1.21.5]
- natsort [required: Any, installed: 8.1.0]
- numpy [required: >=1.16.5, installed: 1.21.5]
- packaging [required: >=20, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pandas [required: >=1.1.1, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- scipy [required: >1.4, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- h5py [required: >=2.10.0, installed: 3.6.0]
- numpy [required: >=1.14.5, installed: 1.21.5]
- joblib [required: Any, installed: 1.1.0]
- matplotlib [required: >=3.1.2, installed: 3.5.1]
- cycler [required: >=0.10, installed: 0.11.0]
- fonttools [required: >=4.22.0, installed: 4.31.2]
- kiwisolver [required: >=1.0.1, installed: 1.4.2]
- numpy [required: >=1.17, installed: 1.21.5]
- packaging [required: >=20.0, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pillow [required: >=6.2.0, installed: 9.0.1]
- pyparsing [required: >=2.2.1, installed: 3.0.7]
- python-dateutil [required: >=2.7, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- natsort [required: Any, installed: 8.1.0]
- networkx [required: >=2.3, installed: 2.7.1]
- numba [required: >=0.41.0, installed: 0.55.1]
- llvmlite [required: >=0.38.0rc1,<0.39, installed: 0.38.0]
- numpy [required: >=1.18,<1.22, installed: 1.21.5]
- setuptools [required: Any, installed: 63.2.0]
- numpy [required: >=1.17.0, installed: 1.21.5]
- packaging [required: Any, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pandas [required: >=0.21, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- patsy [required: Any, installed: 0.5.2]
- numpy [required: >=1.4, installed: 1.21.5]
- six [required: Any, installed: 1.16.0]
- scikit-learn [required: >=0.22, installed: 1.0.2]
- joblib [required: >=0.11, installed: 1.1.0]
- numpy [required: >=1.14.6, installed: 1.21.5]
- scipy [required: >=1.1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- threadpoolctl [required: >=2.0.0, installed: 3.1.0]
- scipy [required: >=1.4, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- seaborn [required: Any, installed: 0.11.2]
- matplotlib [required: >=2.2, installed: 3.5.1]
- cycler [required: >=0.10, installed: 0.11.0]
- fonttools [required: >=4.22.0, installed: 4.31.2]
- kiwisolver [required: >=1.0.1, installed: 1.4.2]
- numpy [required: >=1.17, installed: 1.21.5]
- packaging [required: >=20.0, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pillow [required: >=6.2.0, installed: 9.0.1]
- pyparsing [required: >=2.2.1, installed: 3.0.7]
- python-dateutil [required: >=2.7, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- numpy [required: >=1.15, installed: 1.21.5]
- pandas [required: >=0.23, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- scipy [required: >=1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- sinfo [required: Any, installed: 0.3.4]
- stdlib-list [required: Any, installed: 0.8.0]
- statsmodels [required: >=0.10.0rc2, installed: 0.13.2]
- numpy [required: >=1.17, installed: 1.21.5]
- packaging [required: >=21.3, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- pandas [required: >=0.25, installed: 1.4.1]
- numpy [required: >=1.18.5, installed: 1.21.5]
- python-dateutil [required: >=2.8.1, installed: 2.8.2]
- six [required: >=1.5, installed: 1.16.0]
- pytz [required: >=2020.1, installed: 2022.1]
- patsy [required: >=0.5.2, installed: 0.5.2]
- numpy [required: >=1.4, installed: 1.21.5]
- six [required: Any, installed: 1.16.0]
- scipy [required: >=1.3, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- tables [required: Any, installed: 3.7.0]
- numexpr [required: >=2.6.2, installed: 2.8.1]
- numpy [required: >=1.13.3, installed: 1.21.5]
- packaging [required: Any, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- numpy [required: >=1.19.0, installed: 1.21.5]
- packaging [required: Any, installed: 21.3]
- pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.7]
- tqdm [required: Any, installed: 4.63.1]
- umap-learn [required: >=0.3.10, installed: 0.5.2]
- numba [required: >=0.49, installed: 0.55.1]
- llvmlite [required: >=0.38.0rc1,<0.39, installed: 0.38.0]
- numpy [required: >=1.18,<1.22, installed: 1.21.5]
- setuptools [required: Any, installed: 63.2.0]
- numpy [required: >=1.17, installed: 1.21.5]
- pynndescent [required: >=0.5, installed: 0.5.6]
- joblib [required: >=0.11, installed: 1.1.0]
- llvmlite [required: >=0.30, installed: 0.38.0]
- numba [required: >=0.51.2, installed: 0.55.1]
- llvmlite [required: >=0.38.0rc1,<0.39, installed: 0.38.0]
- numpy [required: >=1.18,<1.22, installed: 1.21.5]
- setuptools [required: Any, installed: 63.2.0]
- scikit-learn [required: >=0.18, installed: 1.0.2]
- joblib [required: >=0.11, installed: 1.1.0]
- numpy [required: >=1.14.6, installed: 1.21.5]
- scipy [required: >=1.1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- threadpoolctl [required: >=2.0.0, installed: 3.1.0]
- scipy [required: >=1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- scikit-learn [required: >=0.22, installed: 1.0.2]
- joblib [required: >=0.11, installed: 1.1.0]
- numpy [required: >=1.14.6, installed: 1.21.5]
- scipy [required: >=1.1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- threadpoolctl [required: >=2.0.0, installed: 3.1.0]
- scipy [required: >=1.0, installed: 1.8.0]
- numpy [required: >=1.17.3,<1.25.0, installed: 1.21.5]
- tqdm [required: Any, installed: 4.63.1]
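Comparing the two listings by eye is tedious, so here is a small sketch of how they could be diffed programmatically. The excerpts are copied from the conda list and pipdeptree outputs above; the parsing functions are ad hoc helpers written for this illustration and only handle the line shapes shown in this thread.

```python
import re

CONDA_EXCERPT = """\
# Name Version Build Channel
numpy 1.23.5 py39h3d75532_0 conda-forge
pandas 2.0.0 py39h2ad29b5_0 conda-forge
pyranges 0.0.120 pyh7cba7a3_0 bioconda
pyroe 0.9.0 pyhdfd78af_0 bioconda
"""

PIP_EXCERPT = """\
pyroe==0.9.0
- numpy [required: Any, installed: 1.21.5]
- pandas [required: >=1.3.0, installed: 1.4.1]
- pyranges [required: >=0.0.120, installed: 0.0.120]
"""


def normalize(name):
    """Treat names like typing_extensions and typing-extensions as the same."""
    return name.lower().replace("_", "-")


def parse_conda_list(text):
    """Map package name -> version from `conda list` style lines."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and not line.startswith("#"):
            versions[normalize(fields[0])] = fields[1]
    return versions


def parse_pipdeptree(text):
    """Map package name -> installed version from pipdeptree style lines."""
    versions = {}
    # Top-level entries look like "name==version".
    for name, version in re.findall(r"^([A-Za-z0-9_.\-]+)==(\S+)$", text, flags=re.M):
        versions[normalize(name)] = version
    # Dependencies look like "- name [required: ..., installed: version]".
    dep = r"-\s+([A-Za-z0-9_.\-]+)\s+\[required: .*?, installed: ([^\]]+)\]"
    for name, version in re.findall(dep, text):
        versions[normalize(name)] = version
    return versions


def differing_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for shared packages that differ."""
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}


def major_version_changes(a, b):
    """Names of shared packages whose leading version component differs."""
    diffs = differing_versions(a, b)
    return [n for n, (va, vb) in diffs.items() if va.split(".")[0] != vb.split(".")[0]]


conda_env = parse_conda_list(CONDA_EXCERPT)
pip_env = parse_pipdeptree(PIP_EXCERPT)
print(differing_versions(conda_env, pip_env))
print(major_version_changes(conda_env, pip_env))
```

In this excerpt, pandas is the only package whose major version differs between the two environments (2.0.0 in the conda environment, 1.4.1 in the pip install), which makes it a natural first suspect.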
@rpolicastro,
OK, I think I figured it out! It's because pyranges doesn't guard against pandas 2.0 (which is obviously a major version bump, very new, and introduces several breaking changes).
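To illustrate the incompatibility: the traceback shows pyranges calling `add_categories(["intron"], inplace=True)`, and pandas 2.0 no longer accepts the `inplace` keyword there. A form that should work on both pandas 1.x and 2.x is to reassign the returned object instead. The snippet below is a standalone sketch with a made-up series, not the pyranges code or its eventual fix.

```python
import pandas as pd

# A small categorical column standing in for the Feature column of a GTF table.
feature = pd.Series(["exon", "exon"], dtype="category")

# pandas 2.0 rejects: feature.cat.add_categories(["intron"], inplace=True)
# Portable alternative: add_categories returns a new object, so reassign it.
feature = feature.cat.add_categories(["intron"])

# Now "intron" is a valid category and can be assigned.
feature.iloc[0] = "intron"
print(list(feature.cat.categories))
```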
So, the real solution is for them to fix the incompatibility upstream. However, the temporary solution is to force a pandas < 2.0. This worked for me:
mamba create -n pyroe -c conda-forge -c bioconda pyroe==0.9.0 pandas==1.5.3
And then we should specify this requirement upstream in the bioconda recipe and in the pyproject.toml. Please let me know if this works for you.
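Besides pinning in the recipe, a quick way to check whether a given environment is affected is to look at the installed pandas version. This is only a sketch using the standard library; the function names are hypothetical, and the "< 2.0" rule simply mirrors the temporary restriction discussed here.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Leading integer component of a version string such as '1.5.3'."""
    return int(version.split(".")[0])


def pandas_is_supported(version: str) -> bool:
    """True if `version` satisfies the temporary pandas < 2.0 restriction."""
    return major_version(version) < 2


def check_installed_pandas():
    """Return True/False for the installed pandas, or None if it is not installed."""
    try:
        installed = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return None
    return pandas_is_supported(installed)


print(pandas_is_supported("1.5.3"), pandas_is_supported("2.0.0"))
```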
--Rob
I can confirm that downgrading to pandas 1.5.3 fixed the error 😀
Excellent. I've filed the bug report upstream, and am pushing a 0.9.1 of pyroe with the <2.0 restriction on pandas (which also fixes the sub-command aliasing issue). I'll close this for the time being, but hopefully we can get the underlying issue with pyranges fixed upstream (until we get a chance to re-implement the splici/spliceu extraction directly in rust 😉 @DongzeHE).
Hi!
I was encountering an error when running simpleaf index with a custom genome, so I tried making a reprex. I encountered a different error with this reprex, so I figured we could work through this first.
Relevant versions:
I'll use the ENSEMBL bakers yeast genome for the reprex.
Preparing to run the indexing step.
Running the indexing.
Resulting error.
Cheers, Bob