jiyuuchc / chioso

GNU General Public License v3.0

error while running chioso step1 #1

Open shrutikhare-git opened 3 months ago

shrutikhare-git commented 3 months ago

I am trying to run chioso on an Ubuntu machine. What does the following error mean? Is it about h5ad formatting? (I converted the data from RDS to h5ad.) Could you share the expected format for the gene list? I currently have it as a text file with one gene name per line. Thanks!

command - python -m chioso.pp-ref --data data.h5ad --genes gene_list --outdir out

ERROR -

2024-06-26 18:30:52.066995: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-26 18:30:52.659633: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
I0626 18:30:53.861779 140163648001856 pp-ref.py:32] Read gene names from gene_list
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/site-packages/chioso/pp-ref.py", line 82, in <module>
    app.run(main)
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/site-packages/chioso/pp-ref.py", line 78, in main
    process_h5ad()
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/site-packages/chioso/pp-ref.py", line 34, in process_h5ad
    genes = json.load(f)
            ^^^^^^^^^^^^
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/json/__init__.py", line 293, in load
    return loads(fp.read(),
           ^^^^^^^^^^^^^^^^
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dglab/software/miniforge/envs/chioso/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

jiyuuchc commented 3 months ago

Apologies for the misleading documentation.

The gene_list file should follow JSON format, i.e., it should look like this:

["4930594M17Rik", "Gm2832", "Enpp5", "Gm9837", ....]
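
For anyone starting from a plain text file with one gene name per line, as in the original question, the conversion to a JSON array can be sketched roughly as below. This is not part of chioso; the function name `gene_list_to_json` and the file names in the usage note are hypothetical, and the only thing taken from this thread is that the file is read with `json.load` and should contain a JSON array of gene names.

```python
import json
from pathlib import Path


def gene_list_to_json(txt_path, json_path):
    """Convert a one-gene-per-line text file into a JSON array of strings.

    Hypothetical helper, not part of chioso. Blank lines and surrounding
    whitespace are dropped.
    """
    lines = Path(txt_path).read_text().splitlines()
    genes = [line.strip() for line in lines if line.strip()]
    # json.dumps writes the list as ["gene1", "gene2", ...]
    Path(json_path).write_text(json.dumps(genes))
    return genes
```

Calling `gene_list_to_json("gene_list.txt", "gene_list")` would then produce a file that can be passed to `--genes`.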

shrutikhare-git commented 3 months ago

Thanks a lot for your prompt reply. I now have a new error: KeyError: 'celltype'. I guess my object needs to have a column named 'celltype'? (Mine is currently called Broad_CellType. Is there any way to specify that instead?) TIA.

jiyuuchc commented 3 months ago


You can override the column name on the command line: --col=Broad_CellType



shrutikhare-git commented 1 month ago

Hi again, I am currently stuck with this vague error while running step 1: AttributeError: 'numpy.ndarray' object has no attribute 'indices'. I am using this command on an Ubuntu machine: python -m chioso.pp-ref --data /path/object.h5ad --genes /path/gene_list_test --outdir /test --col="Broad_CellType"

I was also wondering at which stage I need to specify the spatial data in space-delimited text format with four feature columns: gene, x, y, counts. Thanks!

jiyuuchc commented 1 month ago


Shruti,

The error occurs because your h5ad file was created with a dense array instead of a sparse array.

The fact that your reference sequencing data can be stored as a dense array indicates that this is a fairly small dataset.

Chioso is created specifically to handle very large datasets, which current tools are having difficulties with. Thus the code assumed that the files were created with sparse arrays.
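
To illustrate the difference behind the AttributeError: a dense `numpy.ndarray` has no `indices` attribute, while a SciPy CSR sparse matrix does. The sketch below is not chioso code. The helper `to_csr` is a hypothetical name, and the assumption that the expression matrix would be stored in CSR form is inferred from the error message, not confirmed in this thread.

```python
import numpy as np
from scipy import sparse


def to_csr(matrix):
    """Return the matrix as a SciPy CSR matrix, converting dense input if needed.

    Hypothetical helper, not part of chioso.
    """
    if sparse.issparse(matrix):
        return matrix.tocsr()
    return sparse.csr_matrix(np.asarray(matrix))


# A tiny dense count matrix (2 cells x 3 genes), made up for illustration.
dense = np.array([[0, 2, 0], [1, 0, 0]])
print(hasattr(dense, "indices"))  # a dense ndarray carries no CSR index array

converted = to_csr(dense)
print(converted.indices)  # column indices of the nonzero entries, row by row
```

If the file is written with the anndata package, the same conversion would presumably be applied to the object's `X` matrix before saving; that step is an assumption and is not shown here.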

I've committed a patch to allow dense arrays, so your command should pass now if you update your installation to the head version.

However, be warned that you may not find chioso to have any significant advantage if you are dealing with a relatively small amount of data.

Regarding your second question: I wonder if you missed the bottom half of the step-1 code block in the README:

python -m chioso.pp-spatial --data --genes --outdir

This can be run independently of the reference data preprocessing. Typically you also run this multiple times, because people tend to save data of multiple samples in different files.
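
Running the spatial preprocessing once per sample file could be scripted along these lines. This is only a sketch: the flags `--data`, `--genes`, and `--outdir` are the ones shown above, but the function name `build_spatial_commands`, the file names, and the choice of one output directory per sample are assumptions made for illustration.

```python
import sys
from pathlib import Path


def build_spatial_commands(data_files, genes, out_root):
    """Build one chioso.pp-spatial command line per spatial data file.

    Hypothetical helper, not part of chioso. Each sample gets its own
    output directory named after the input file (an assumption).
    """
    commands = []
    for data_file in data_files:
        sample = Path(data_file).stem
        commands.append([
            sys.executable, "-m", "chioso.pp-spatial",
            "--data", str(data_file),
            "--genes", str(genes),
            "--outdir", str(Path(out_root) / sample),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute the samples one after another.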

Ji

