aristoteleo / dynast-release

Inclusive and efficient quantification of labeling and splicing RNAs for time-resolved metabolic labeling based scRNA-seq experiments
https://dynast-release.readthedocs.io/en/latest/
MIT License

Exception on dynast estimate - cell barcode names #1

Closed jmartinrufino closed 2 years ago

jmartinrufino commented 3 years ago

Hi,

After processing my data with dynast counts, I ran dynast estimate with the --groups argument.

I obtained a list of the cell barcodes of interest from the anndata object generated by dynast counts, and saved it as a csv with the following format:

AAAAAAAAAAGG,group_1
AAAAAGACCAAT,group_1
AAAACATTGGGC,group_1
AAAACCAAGTGA,group_1
AAAACCAGCGTG,group_1
AAAACCTGAGTA,group_1
AAAACTACTCCC,group_1
AAAACTCCGGGT,group_1
AAAAGCTGCATG,group_1
AAAAGTTTCAGG,group_1
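For anyone reproducing this, a minimal sketch of how such a file could be written is shown here. It uses only the standard library. The `barcodes` list is a hypothetical stand-in for the barcode names read from the AnnData object (for example `list(adata.obs_names)`), and `write_groups_csv` is an illustrative helper, not part of dynast.

```python
import csv
import io


def write_groups_csv(barcodes, group, handle):
    """Write one `barcode,group` row per barcode, with no header row."""
    writer = csv.writer(handle, lineterminator="\n")
    for barcode in barcodes:
        writer.writerow([barcode, group])


# Hypothetical stand-in for the barcodes taken from the AnnData object.
barcodes = ["AAAAAAAAAAGG", "AAAAAGACCAAT", "AAAACATTGGGC"]

# An in-memory buffer keeps the sketch self-contained; a real run would
# pass an open file handle instead.
buffer = io.StringIO()
write_groups_csv(barcodes, "group_1", buffer)
groups_csv_text = buffer.getvalue()
print(groups_csv_text, end="")
```

Every cell is assigned to the same group here, matching the single-group case described in this issue.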

I am currently getting the following error at the last step of dynast estimate, during anndata object generation:

ERROR [main] An exception occurred
Traceback (most recent call last):
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 706, in astype
    casted = self._values.astype(dtype, copy=copy)
ValueError: invalid literal for int() with base 10: 'AAAAAAAAAAGG'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/dynast/main.py", line 665, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=args.tmp)
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/dynast/main.py", line 578, in parse_estimate
    estimate(
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/dynast/logging.py", line 53, in inner
    return func(*args, **kwargs)
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/dynast/estimate.py", line 224, in estimate
    adata.obs['count_dir'] = adata.obs.index.str.split('-').str[-1].astype(int).map({
  File "/home/jmartinr/anaconda3/envs/dynast_env/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 708, in astype
    raise TypeError(
TypeError: Cannot cast Index to dtype int64

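Based on the error, the issue seems to be caused by the format of the cell barcodes: the line in estimate.py splits each barcode name on "-" and casts the last piece to an integer, so it appears to expect names ending in a numeric suffix such as "-1". However, dynast counts does not produce cell barcodes with "-1" at the end.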
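The failure can be reproduced in plain Python, without pandas, by mirroring the split-and-cast expression from the traceback on a single barcode. The sketch below also shows one possible tolerant parse. `suffix_index` and its `default` argument are hypothetical names chosen for illustration; this is not necessarily how the fix in dynast was implemented.

```python
def suffix_index(barcode, default=0):
    """Return the integer after the last '-' in a barcode, or `default` if there is none."""
    _, sep, tail = barcode.rpartition("-")
    if sep and tail.isdigit():
        return int(tail)
    return default


# Strict parse, mirroring split('-') followed by a cast to int on one barcode.
try:
    int("AAAAAAAAAAGG".split("-")[-1])
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(strict_error)

# Tolerant parse: barcodes without a numeric suffix fall back to the default.
print(suffix_index("AAAAAAAAAAGG"))
print(suffix_index("AAAAAAAAAAGG-1"))
```

With a single input directory, falling back to one default index is enough; with several inputs, the suffix would still be needed to tell them apart.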

I am using a single input directory and a single group of cells in the CSV (I want to run dynast estimate on all cells in the file together).

Thanks, Jorge

Xiaojieqiu commented 3 years ago

Thanks Jorge! Just another minor point for @Lioscro: adata.obs.index.str.split('-').str[-1].astype(int) seems to require a sufficiently recent pandas version; 1.2.2 or above works for me. You may want to specify the pandas version in your requirements.txt file.
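As a rough illustration of checking an installed version against that threshold, here is a small standard-library sketch. `parse_release` and `meets_minimum` are hypothetical helpers, and the 1.2.2 minimum is simply the version reported to work in this comment, not a documented requirement. In practice the installed string would come from `pandas.__version__`.

```python
import re


def parse_release(version):
    """Turn a version string such as '1.2.2' into a tuple of ints, e.g. (1, 2, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum="1.2.2"):
    """Return True if `installed` is at least `minimum` (numeric release parts only)."""
    return parse_release(installed) >= parse_release(minimum)


# Example version strings; tuples compare component by component,
# so 1.10.0 is correctly treated as newer than 1.2.2.
for candidate in ["1.1.5", "1.2.2", "1.10.0"]:
    print(candidate, meets_minimum(candidate))
```

Pre-release tags such as "rc1" are ignored by this sketch, so it is only a coarse check.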

Lioscro commented 3 years ago

Hi @jmartinrufino, I've just pushed an update that I believe should fix the issue. Could you try again and let me know if it works now? (For the sake of testing, you may want to truncate your CSV so that only a subset of cells is used.)

jmartinrufino commented 3 years ago

Thanks both for the prompt response! It is working perfectly now with the latest fix. Thanks!!
