novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

An error report #92

Closed chaigsh closed 3 years ago

chaigsh commented 3 years ago

Hello,

When I use Epinano_Variants.py in EpiNano 1.2 to extract base-calling error features, an error occurs. The commands are below:

epinano="/public/work/guoshi/bin/EpiNano-master/"
sam2tsv="/public/work/guoshi/bin/jvarkit-master/dist/sam2tsv.jar"
ref="/public/data/guoshi/reference/mouse/GENCODE/"
bamfile="/public/work/guoshi/data_processing/long-reads/nanopore/alignment/"

python ${epinano}Epinano_Variants.py -R ${ref}gencode.vM26.transcripts.fa --bam ${bamfile}meswt_minimap_transcriptome.filt.sort.bam --sam2tsv ${sam2tsv} --type t

The error is:

Process Process-2:
Traceback (most recent call last):
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/public/work/guoshi/bin/EpiNano-master/Epinano_Variants.py", line 44, in split_tsv_for_per_site_var_freq
    head = next(tsv)
StopIteration

Thank you,

Regards

Huanle commented 3 years ago

Hi @chaigsh ,

Can you try with the sam2tsv in the misc/ folder? Thanks.
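The `StopIteration` raised at `head = next(tsv)` in the traceback above indicates that the iterator was already exhausted when the header line was requested, which is what one would expect if the sam2tsv step produced no output. A minimal sketch of how such a read could be guarded is shown here; `read_header` is a hypothetical helper for illustration, not part of EpiNano, and the column names are placeholders.

```python
def read_header(lines, source="sam2tsv output"):
    """Return the first line of an iterator, or fail with a descriptive error.

    `lines` must be an iterator (for example an open file handle).
    """
    # next() with a default avoids a bare StopIteration on empty input.
    header = next(lines, None)
    if header is None:
        raise RuntimeError(
            f"{source} is empty; check that the sam2tsv step ran and produced rows"
        )
    return header.rstrip("\n")


# Usage with placeholder data standing in for the real TSV stream.
tsv = iter(["col_a\tcol_b\n", "1\t2\n"])
print(read_header(tsv))
```

With an empty stream, this raises a `RuntimeError` that names the likely cause instead of a bare `StopIteration` inside a worker process.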

chaigsh commented 3 years ago

Hi @Huanle, yes, I tried it, but other errors occurred. When the parameter "--number_cpus 1" is used, it is OK. However, it runs slowly. Thanks.

Huanle commented 3 years ago

Hi @chaigsh

What is the other error? If it is too slow for you, can you split your bam file into smaller ones? samtools view -hb big.bam {reference_sequence_id} > {reference_sequence_id}.bam would work. Then try running it on the small bam files.
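The splitting step suggested above can be scripted. The following is a sketch under stated assumptions: samtools is on the PATH, the BAM file is coordinate-sorted and indexed (region queries with `samtools view` need an index), and `safe_name`, `split_command`, and `split_bam` are hypothetical helper names. Reference IDs are sanitized for the output file name because transcriptome IDs can contain characters such as `|`.

```python
import re
import subprocess
from pathlib import Path


def safe_name(ref_id):
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", ref_id)


def split_command(bam, ref_id, out_dir):
    """Build the samtools command that extracts one reference into its own BAM."""
    out = Path(out_dir) / f"{safe_name(ref_id)}.bam"
    # -h keeps the header, -b writes BAM, -o names the output file.
    return ["samtools", "view", "-h", "-b", "-o", str(out), str(bam), ref_id]


def split_bam(bam, ref_ids, out_dir):
    """Run one samtools call per reference ID (requires samtools and a BAM index)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for ref_id in ref_ids:
        subprocess.run(split_command(bam, ref_id, out_dir), check=True)


# Show the command for one placeholder reference without running samtools.
print(" ".join(split_command("big.bam", "ref1", "parts")))
```

Calling `split_bam("big.bam", ["ref1", "ref2"], "parts")` would then write one small BAM per reference, each of which can be passed to Epinano_Variants.py separately.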

chaigsh commented 3 years ago

Hi @Huanle, with the parameter "--number_cpus 1", no results are returned. Without "--number_cpus 1", the following errors occur:

/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py:13: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
/public/work/guoshi/data_processing/long-reads/nanopore/alignment/meswt_minimap_transcriptome.filt.sortTMP already exists, will overwrite it
/public/work/guoshi/data_processing/long-reads/nanopore/alignment/meswt_minimap_transcriptome.filt.sortTMP already exists, will overwrite it
Traceback (most recent call last):
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 137, in raise_on_meta_error
    yield
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3583, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3567, in _extract_meta
    return tuple([_extract_meta(_x, nonempty) for _x in x])
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3567, in <listcomp>
    return tuple([_extract_meta(_x, nonempty) for _x in x])
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3563, in _extract_meta
    return x._meta_nonempty if nonempty else x._meta
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 322, in _meta_nonempty
    return meta_nonempty(self._meta)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 422, in meta_nonempty
    idx = _nonempty_index(x.index)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 348, in _nonempty_index
    return pd.MultiIndex(levels=levels, labels=labels, names=idx.names)
TypeError: __new__() got an unexpected keyword argument 'labels'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/public/work/guoshi/bin/EpiNano-master/Epinano_Variants.py", line 352, in <module>
    main()
  File "/public/work/guoshi/bin/EpiNano-master/Epinano_Variants.py", line 348, in main
    tsv_to_var (tsvit, tmp_dir, out_var_fn, args.number_cpus)
  File "/public/work/guoshi/bin/EpiNano-master/Epinano_Variants.py", line 337, in tsv_to_var
    df_proc (df, out_var_fn)
  File "/public/work/guoshi/bin/EpiNano-master/Epinano_Variants.py", line 221, in df_proc
    gb = gb.reset_index()
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 431, in reset_index
    return self.map_partitions(M.reset_index, drop=drop).clear_divisions()
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 581, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3619, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3583, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 154, in raise_on_meta_error
    raise ValueError(msg)
ValueError: Metadata inference failed in reset_index.

You have supplied a custom function and Dask is unable to determine the type of output that that function returns.

To resolve this please provide a meta= keyword. The docstring of the Dask function you ran should have more information.

Original error is below:

TypeError("__new__() got an unexpected keyword argument 'labels'")

Traceback:

  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 137, in raise_on_meta_error
    yield
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3583, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3567, in _extract_meta
    return tuple([_extract_meta(_x, nonempty) for _x in x])
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3567, in <listcomp>
    return tuple([_extract_meta(_x, nonempty) for _x in x])
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 3563, in _extract_meta
    return x._meta_nonempty if nonempty else x._meta
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 322, in _meta_nonempty
    return meta_nonempty(self._meta)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 422, in meta_nonempty
    idx = _nonempty_index(x.index)
  File "/public/work/guoshi/bin/anaconda3/lib/python3.7/site-packages/dask/dataframe/utils.py", line 348, in _nonempty_index
    return pd.MultiIndex(levels=levels, labels=labels, names=idx.names)
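The innermost frame in this traceback calls `pd.MultiIndex(..., labels=labels, ...)`. pandas deprecated the `labels` keyword in favour of `codes` in 0.24 and removed it in 1.0, so this error points to an installed Dask release that predates that change running against a newer pandas. A small sketch of that rule as a version check follows; `parse_major_minor` and `multiindex_accepts_labels` are hypothetical helpers, and the version strings are examples only, not the versions from this report.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '0.24.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def multiindex_accepts_labels(pandas_version):
    """True if this pandas release still accepts MultiIndex(labels=...)."""
    # The `labels` keyword was removed in pandas 1.0.
    return parse_major_minor(pandas_version) < (1, 0)


# Example version strings, chosen only to show both outcomes.
for example in ("0.24.2", "1.0.3"):
    print(example, multiindex_accepts_labels(example))
```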

Huanle commented 3 years ago

Hi @chaigsh ,

Thanks for sharing the error message. Can you also check the versions of the packages that you have installed and see whether they match the required ones?
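One way to run that comparison is sketched here. The helpers `installed_versions` and `mismatches` are hypothetical, and the pins in `REQUIRED` are placeholders to be replaced with the versions listed in the EpiNano requirements; they are not the real required versions.

```python
try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def mismatches(installed, required):
    """Return {name: (installed, required)} for every package that differs."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    # Placeholder pins: substitute the versions from the EpiNano requirements.
    REQUIRED = {"pandas": "X.Y.Z", "dask": "X.Y.Z"}
    for name, (have, want) in mismatches(installed_versions(REQUIRED), REQUIRED).items():
        print(f"{name}: installed {have}, required {want}")
```

An exact string comparison is used because the question here is whether the environment matches the pinned versions, not whether it is merely newer.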

Huanle commented 3 years ago

Hi @chaigsh ,

Can you try the docker image that I mentioned in #94? I think that will solve your issue.