GoekeLab / xpore

Identification of differential RNA modifications from nanopore direct RNA sequencing
https://xpore.readthedocs.io/
MIT License

Error in xpore-dataprep step #24

Closed madnessfish closed 4 years ago

madnessfish commented 4 years ago

I have encountered a problem while running the xpore-dataprep step. I mapped the raw DRS reads to the human reference transcriptome with minimap2. I wonder whether the problem is related to the read_index in the summary file. Thank you so much!

eventalign.txt eventalign

summary.txt summary

Command: xpore-dataprep --eventalign eventalign.txt --summary summary.txt --out_dir dataprep

It raises the error below.

Process Consumer-1:
Traceback (most recent call last):
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/helper.py", line 77, in run
    result = self.task_function(*next_task_args,self.locks)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/dataprep.py", line 78, in combine
    df2write = df_events_per_read.loc[[tx_id,read_id],:].reset_index()
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 847, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 1099, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 1037, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexing.py", line 1240, in _get_listlike_indexer
    indexer, keyarr = ax._convert_listlike_indexer(key)
  File "/exeh_2/sin/bin/miniconda3/envs/xpore/lib/python3.8/site-packages/pandas/core/indexes/multi.py", line 2397, in _convert_listlike_indexer
    raise KeyError(f"{keyarr[mask]} not in index")
KeyError: "['29dfb566-2dbe-4b8f-8c3e-ba2f6d1403e9'] not in index"

ploy-np commented 4 years ago

Hi @madnessfish, may I ask which version of xpore you are using? We encourage you to use xpore v0.5.2.

pabloacera commented 4 years ago

Hi! I am running the demo code with version 0.5.2 and still get a similar error:

(nanopore) labuser@JCSMR-049555LD:~/lib/xpore/demo/data/HEK293T-WT-rep1$ xpore-dataprep --eventalign nanopolish/eventalign.txt --summary nanopolish/summary.txt --out_dir dataprep --genome
Process Consumer-1:
Traceback (most recent call last):
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/xpore-0.5.2-py3.7.egg/xpore/scripts/helper.py", line 77, in run
    result = self.task_function(*next_task_args,self.locks)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/xpore-0.5.2-py3.7.egg/xpore/scripts/dataprep.py", line 78, in combine
    df2write = df_events_per_read.loc[[tx_id,read_id],:].reset_index()
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 847, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1099, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1037, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1240, in _get_listlike_indexer
    indexer, keyarr = ax._convert_listlike_indexer(key)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2397, in _convert_listlike_indexer
    raise KeyError(f"{keyarr[mask]} not in index")
KeyError: "['decfc830-9eb6-4a10-97fb-d117db06178d'] not in index"

Thanks

ploy-np commented 4 years ago

Hi @pabloacera, @madnessfish,

Between v0.5.1 and v0.5.2, I changed the data structure of eventalign.hdf5, one of the outputs of xpore-dataprep. Can you check whether you removed this file before running xpore-dataprep? If you still get the error, please let me know.
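
For anyone who wants to script that check, here is a minimal sketch. It assumes the stale eventalign.hdf5 sits directly inside the directory passed to --out_dir, which is a guess based on the commands in this thread; the helper name `remove_stale_eventalign` is made up for illustration and is not part of xpore.

```python
from pathlib import Path


def remove_stale_eventalign(out_dir):
    """Delete out_dir/eventalign.hdf5 if it exists.

    Returns True if a file was removed, False if there was nothing to remove.
    The file location is an assumption, not something xpore documents here.
    """
    stale = Path(out_dir) / "eventalign.hdf5"
    if stale.is_file():
        stale.unlink()
        return True
    return False
```

Calling `remove_stale_eventalign("dataprep")` before re-running xpore-dataprep would clear a file left over from an older version, if one is there.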

pabloacera commented 4 years ago

Hi, thanks for the answer. I currently have no eventalign.hdf5. This is my file structure:

bamtx/ fast5/ fastq/ nanopolish/

This is my command:


(nanopore) labuser@JCSMR-049555LD:~/lib/xpore/demo/data/HEK293T-METTL3-KO-rep1$ xpore-dataprep --eventalign nanopolish/eventalign.txt --summary nanopolish/summary.txt --out_dir dataprep --genome
Process Consumer-1:
Traceback (most recent call last):
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/xpore-0.5.2-py3.7.egg/xpore/scripts/helper.py", line 77, in run
    result = self.task_function(*next_task_args,self.locks)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/xpore-0.5.2-py3.7.egg/xpore/scripts/dataprep.py", line 78, in combine
    df2write = df_events_per_read.loc[[tx_id,read_id],:].reset_index()
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 847, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1099, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1037, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexing.py", line 1240, in _get_listlike_indexer
    indexer, keyarr = ax._convert_listlike_indexer(key)
  File "/home/labuser/anaconda3/envs/nanopore/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2397, in _convert_listlike_indexer
    raise KeyError(f"{keyarr[mask]} not in index")
KeyError: "['e2e8889a-e12c-4bb9-b5c6-142b1246d31f'] not in index"

thanks!

ploy-np commented 4 years ago

Hi @pabloacera, @madnessfish,

I made a mistake! The published xpore v0.5.2 did not yet include the latest changes from my GitHub branch, which is why I did not see the error on my local machine.

I have now published the latest version (xpore v0.5.3). Could you please update and try again?

Apologies for this!

pabloacera commented 4 years ago

That's completely fine! thanks a lot for taking the time! =)

pabloacera commented 4 years ago

Hi, somehow it was still failing for me. I made a simple change in the xpore-dataprep script, adding parentheses on line 79:

df2write = df_events_per_read.loc[[(tx_id,read_id)],:].reset_index()

I have already opened a pull request just in case, though you have probably fixed it in another version. Thanks again, and sorry for the inconvenience.
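
For readers wondering why the extra parentheses matter, here is a small sketch of the pandas behaviour, assuming df_events_per_read is indexed by a two-level MultiIndex of transcript ID and read ID. The toy frame, the level names, and the `n_events` column are made up for illustration. With `[tx_id, read_id]`, `.loc` reads the list as two separate first-level labels, so the read ID is looked up among the transcript IDs and is not found. With `[(tx_id, read_id)]`, the list holds one complete key.

```python
import pandas as pd

# Toy stand-in for df_events_per_read; names and values are hypothetical.
index = pd.MultiIndex.from_tuples(
    [("tx1", "read-a"), ("tx1", "read-b"), ("tx2", "read-c")],
    names=["transcript_id", "read_id"],
)
df = pd.DataFrame({"n_events": [10, 20, 30]}, index=index)


def select_with_list(frame, tx_id, read_id):
    # [tx_id, read_id] is treated as two labels on the FIRST index level.
    return frame.loc[[tx_id, read_id], :].reset_index()


def select_with_tuple(frame, tx_id, read_id):
    # [(tx_id, read_id)] is a list holding one full (level 0, level 1) key.
    return frame.loc[[(tx_id, read_id)], :].reset_index()


try:
    select_with_list(df, "tx1", "read-a")
    list_lookup_failed = False
except KeyError as err:
    # The read ID is not a transcript ID, so the lookup fails.
    list_lookup_failed = True
    print("KeyError:", err)

selected = select_with_tuple(df, "tx1", "read-a")
print(selected)
```

The first call fails in the same way as the tracebacks above, while the second returns the single matching row.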

ploy-np commented 4 years ago

Hi @pabloacera, thank you so much! I will test and merge your change in a future version. So, after you added the parentheses, everything works fine?

pabloacera commented 4 years ago

Exactly! After that change, xpore-dataprep worked fine for me, and then xpore-diffmod --config Hek293T_config.yml also worked! Cheers!

ploy-np commented 4 years ago

Thanks a lot @pabloacera

ploy-np commented 4 years ago

This was already fixed in release v0.5.4. May I close this issue? Thank you very much @pabloacera @madnessfish.