getzlab / SignatureAnalyzer

Updated SignatureAnalyzer-GPU with mutational spectra & RNA expression compatibility.
MIT License

TypeError: Passing a dict as an indexer is not supported. Use a list instead. #51

Closed daz10000 closed 1 year ago

daz10000 commented 1 year ago

I am using the current source, checked out of git within the last week, with Python 3.11 and Pandas 2.0.3, and I'm getting a strange TypeError that seems to be caused by signatureanalyzer passing a dictionary instead of a list to Pandas. I can't completely rule out something else being off (I am porting somebody else's code to a different environment; it did work with some past combination of packages, but enough has changed). That said, Pandas does seem to have a reasonable complaint here:

<calling code>
  File "/home/ec2-user/src/getzlab-SignatureAnalyzer/signatureanalyzer/spectra.py", line 120, in get_spectra_from_maf
    spectra = spectra.loc[context_use]
              ~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/ec2-user/.local/lib/python3.11/site-packages/pandas/core/indexing.py", line 1091, in __getitem__
    check_dict_or_set_indexers(key)
  File "/home/ec2-user/.local/lib/python3.11/site-packages/pandas/core/indexing.py", line 2627, in check_dict_or_set_indexers
    raise TypeError(
TypeError: Passing a dict as an indexer is not supported. Use a list instead.

I had the library print out context_use, and it is a 96-member dictionary mapping the four-base sequences onto the integers 1..96. I presume that just the keys or the values are supposed to be used for the index. I'll experiment with the library, but I don't know whether this previously worked or I have some other problem upstream. Any thoughts appreciated.

context_use={'ACAA': 1, 'ACAC': 2, 'ACAG': 3, 'ACAT': 4, 'ACCA': 5, 'ACCC': 6, 'ACCG': 7, 'ACCT': 8, 'ACGA': 9, 'ACGC': 10, 'ACGG': 11, 'ACGT': 12, 'ACTA': 13, 'ACTC': 14, 'ACTG': 15, 'ACTT': 16, 'AGAA': 17, 'AGAC': 18, 'AGAG': 19, 'AGAT': 20, 'AGCA': 21, 'AGCC': 22, 'AGCG': 23, 'AGCT': 24, 'AGGA': 25, 'AGGC': 26, 'AGGG': 27, 'AGGT': 28, 'AGTA': 29, 'AGTC': 30, 'AGTG': 31, 'AGTT': 32, 'ATAA': 33, 'ATAC': 34, 'ATAG': 35, 'ATAT': 36, 'ATCA': 37, 'ATCC': 38, 'ATCG': 39, 'ATCT': 40, 'ATGA': 41, 'ATGC': 42, 'ATGG': 43, 'ATGT': 44, 'ATTA': 45, 'ATTC': 46, 'ATTG': 47, 'ATTT': 48, 'CAAA': 49, 'CAAC': 50, 'CAAG': 51, 'CAAT': 52, 'CACA': 53, 'CACC': 54, 'CACG': 55, 'CACT': 56, 'CAGA': 57, 'CAGC': 58, 'CAGG': 59, 'CAGT': 60, 'CATA': 61, 'CATC': 62, 'CATG': 63, 'CATT': 64, 'CGAA': 65, 'CGAC': 66, 'CGAG': 67, 'CGAT': 68, 'CGCA': 69, 'CGCC': 70, 'CGCG': 71, 'CGCT': 72, 'CGGA': 73, 'CGGC': 74, 'CGGG': 75, 'CGGT': 76, 'CGTA': 77, 'CGTC': 78, 'CGTG': 79, 'CGTT': 80, 'CTAA': 81, 'CTAC': 82, 'CTAG': 83, 'CTAT': 84, 'CTCA': 85, 'CTCC': 86, 'CTCG': 87, 'CTCT': 88, 'CTGA': 89, 'CTGC': 90, 'CTGG': 91, 'CTGT': 92, 'CTTA': 93, 'CTTC': 94, 'CTTG': 95, 'CTTT': 96}

Pardon my debugging code, but the piece that breaks is:

        maf[context_form] = contig
        spectra = maf.groupby([context_form, 'sample']).size().unstack().fillna(0).astype(int)
        for c in context_use:
            if c not in spectra.index:
                spectra.loc[c] = 0
        print(f"XXX: context_use={context_use}")
        print(f"XXX: spectra cols = {spectra.columns}")
        print(f"XXX: spectra rows = {spectra.shape[0]}")
        spectra = spectra.loc[context_use]

Adjusting the last line to pass in context_use.keys() allows it to run. I'm checking whether that affects the output. Thoughts appreciated.
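For anyone wanting to sanity-check the workaround, here is a minimal sketch on toy data, not the library's actual code. The three-entry `context_use` and the `spectra` table below are made-up stand-ins for the real 96-context mapping. It assumes the intent of the original line is to order the rows by the dictionary's keys; converting the keys to a list makes that explicit and avoids the dict indexer that Pandas 2.x rejects.

```python
import pandas as pd

# Hypothetical stand-in for the 96-entry context_use mapping.
context_use = {"ACAA": 1, "ACAC": 2, "ACAG": 3}

# Toy spectra table: rows are contexts (deliberately out of order),
# columns are samples.
spectra = pd.DataFrame(
    {"sample_1": [5, 0, 2], "sample_2": [1, 3, 0]},
    index=["ACAG", "ACAA", "ACAC"],
)

# Iterating a dict yields its keys in insertion order, so list(context_use)
# is the ordered list of context labels. Passing that list to .loc selects
# and reorders the rows without handing Pandas a dict.
ordered = spectra.loc[list(context_use)]
print(ordered)
```

Since only the row labels are used for selection, the integer values in the dictionary play no part in the result; the counts themselves are unchanged, just reordered.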

yoakiyama commented 1 year ago

Thanks for bringing this to our attention! I think this is a pandas versioning issue. We edited the code to use context_use.keys(), so it should be working as intended now. Thanks again!
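As an aside, and only as an alternative sketch rather than the change described above, the "add missing contexts, then reorder" step can also be written with `DataFrame.reindex`, which inserts absent row labels and orders the rows in one call. The column name `"context"` and the tiny `maf` table are hypothetical placeholders for illustration.

```python
import pandas as pd

# Hypothetical stand-ins: a three-entry context mapping and a tiny MAF-like table.
context_use = {"ACAA": 1, "ACAC": 2, "ACAG": 3}
maf = pd.DataFrame(
    {
        "context": ["ACAG", "ACAA", "ACAG"],
        "sample": ["s1", "s1", "s2"],
    }
)

# Count mutations per (context, sample), then pivot samples into columns.
counts = maf.groupby(["context", "sample"]).size().unstack()

# reindex adds rows for contexts that never occurred (here "ACAC") and puts
# the rows in the order of the dictionary's keys; missing cells become NaN,
# which are then filled with 0 and cast back to integers.
spectra_reindexed = counts.reindex(index=list(context_use)).fillna(0).astype(int)
print(spectra_reindexed)
```

This avoids the explicit loop over contexts, at the cost of diverging from the structure of the existing function.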