zktuong / dandelion

dandelion - A single cell BCR/TCR V(D)J-seq analysis package for 10X Chromium 5' data
https://sc-dandelion.readthedocs.io/
GNU Affero General Public License v3.0

Singularity Container Preprocessing Error #368

Open bpr4242 opened 2 months ago

bpr4242 commented 2 months ago

Description of the bug

Hi Zewen, great package, and I love the container you made! I was trying to do the preprocessing manually but had some issues with file paths, so I decided to use your container and used the preprocessing function. There was an issue, I think, with how I labeled the individual column. Stupidly, I didn't think to make the values lead with an alphabetical character, and I think the container read them in as int64 (if I am understanding this correctly). Maybe a type assignment when importing in preprocessing.py would help? I will try again with the individual changed to ms'3'. Thanks, and let me know if you need anything else!

Minimal reproducible example

sample,prefix,individual
3s_bcr,3s,3
4s_bcr,4s,4
5s_bcr,5s,5
6s_bcr,6s,6
7s_bcr,7s,7
8s_bcr,8s,8
3b_bcr,3b,3
4b_bcr,4b,4
5b_bcr,5b,5
6b_bcr,6b,6
7b_bcr,7b,7
8b_bcr,8b,8

apptainer run -B $PWD ~/kt16_default_sc-dandelion.sif dandelion-preprocess --org=mouse --filter_to_high_confidence --meta ./sample_info.csv

The error message produced by the code above

Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 378, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 288, in main
    ddl.pp.reassign_alleles(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/site-packages/dandelion/preprocessing/_preprocessing.py", line 1439, in reassign_alleles
    out_dir = Path(combined_folder)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 871, in __new__
    self = cls._from_parts(args)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 509, in _from_parts
    drv, root, parts = self._parse_args(args)
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 493, in _parse_args
    a = os.fspath(a)
        ^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not int64

OS information

MacOS container most recent

Version information

command line

Additional context

No response

zktuong commented 2 months ago

Hi @bpr4242, thanks! Yes, you are right: it's a problem with the way numbers are interpreted by default by pandas.read_csv. I would normally never name files/folders as numbers, as it causes issues like this. So yes, changing to an actual string should work.
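
For anyone landing here with the same traceback, the behaviour can be reproduced outside the container. The snippet below is only a sketch: it reuses two rows of the sample sheet from the report, and the helper `first_individual` is a hypothetical name made up for illustration, not part of dandelion. It shows that pandas.read_csv infers an integer dtype for a purely numeric column, that pathlib.Path rejects such a value with a TypeError, and that passing `dtype` to read_csv keeps the value as text. Whether dandelion's preprocessing script does or will pass `dtype` itself is not confirmed in this thread.

```python
import io
from pathlib import Path

import pandas as pd

# Two rows from the sample sheet in the report above.
CSV_TEXT = "sample,prefix,individual\n3s_bcr,3s,3\n4s_bcr,4s,4\n"


def first_individual(**read_csv_kwargs):
    """Hypothetical helper: return the first value of the `individual` column."""
    meta = pd.read_csv(io.StringIO(CSV_TEXT), **read_csv_kwargs)
    return meta["individual"].iloc[0]


# Default parsing: a purely numeric column is inferred as an integer dtype.
inferred = first_individual()
try:
    Path(inferred)
    path_error = None
except TypeError as err:
    # An integer scalar is not a str or os.PathLike, so Path() refuses it.
    path_error = err

# Forcing the column to text keeps the value usable as a folder name.
as_text = first_individual(dtype={"individual": str})
out_dir = Path(as_text)

print(type(inferred).__name__, "->", path_error)
print(type(as_text).__name__, "->", out_dir)
```

On the user side, the workaround already mentioned in this thread (labels that start with a letter in the individual column) avoids the integer inference without touching any code.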