saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

KeyError in srr_to_gsm function in sraweb.py #24

Status: Closed (dibyaaaaax closed this issue 4 years ago)

dibyaaaaax commented 4 years ago

Description

Calling the srr_to_gsm function in sraweb.py raises a KeyError.

What I Did

Command run:

from pysradb.sraweb import SRAweb

sc = SRAweb()
sc.srr_to_gsm("SRR057515")

Traceback:

Traceback (most recent call last):
  File "test_sraweb.py", line 113, in <module>
    test_srr_to_gsm(sc)
  File "test_sraweb.py", line 109, in test_srr_to_gsm
    df = sraweb_connection.srr_to_gsm("SRR057513")
  File "/home/dibya/Documents/new_pysradb/pysradb/pysradb/sraweb.py", line 640, in srr_to_gsm
    return _order_first(joined_df, ["run_accession", "experiment_alias"])
  File "/home/dibya/Documents/new_pysradb/pysradb/pysradb/sraweb.py", line 22, in _order_first
    return df[columns].drop_duplicates()
  File "/home/dibya/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/dibya/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1552, in _get_listlike_indexer
    keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
  File "/home/dibya/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1645, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['experiment_alias'] not in index"

dibyaaaaax commented 4 years ago

In srr_to_gsm, both gsm_df and srr_df contain a column named experiment_alias that is not used as a merge key, so when the two are merged pandas appends its default suffixes _x and _y to that column name. The merged data frame therefore contains experiment_alias_x and experiment_alias_y but no experiment_alias, and selecting experiment_alias from it raises the KeyError.
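
The suffix behaviour can be reproduced outside pysradb with two toy frames. This is only a minimal sketch: the names gsm_df, srr_df, run_accession and experiment_alias come from the issue, while the join key experiment_accession and all values are made up for illustration and may not match what srr_to_gsm actually merges on.

```python
import pandas as pd

# Toy stand-ins for the two frames. The join key "experiment_accession"
# and every value here are assumptions, not taken from pysradb.
gsm_df = pd.DataFrame({
    "experiment_accession": ["SRX0001"],
    "experiment_alias": ["GSM0001"],
})
srr_df = pd.DataFrame({
    "experiment_accession": ["SRX0001"],
    "run_accession": ["SRR0001"],
    "experiment_alias": ["GSM0001"],
})

# experiment_alias exists in both frames but is not a merge key,
# so pandas disambiguates it with the default suffixes _x and _y.
joined_df = gsm_df.merge(srr_df, on="experiment_accession")
merged_columns = list(joined_df.columns)
print(merged_columns)

# Selecting the unsuffixed name fails, as in the traceback above.
try:
    joined_df[["run_accession", "experiment_alias"]]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)
```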

If I have diagnosed this correctly, referring to the column by one of the suffixed names (experiment_alias_x or experiment_alias_y) instead of experiment_alias would solve this.
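
Two ways this suggestion might look in practice are sketched below. Neither is claimed to be the change that was eventually made in pysradb. The frames are toy data, and the join key experiment_accession is a hypothetical stand-in for whatever srr_to_gsm really merges on.

```python
import pandas as pd

# Toy frames; the join key and values are assumptions for illustration.
gsm_df = pd.DataFrame({
    "experiment_accession": ["SRX0001"],
    "experiment_alias": ["GSM0001"],
})
srr_df = pd.DataFrame({
    "experiment_accession": ["SRX0001"],
    "run_accession": ["SRR0001"],
    "experiment_alias": ["GSM0001"],
})

# Option A: select the suffixed column after the merge, then rename it
# back so callers still see a column called experiment_alias.
joined_df = gsm_df.merge(srr_df, on="experiment_accession")
result_a = (
    joined_df[["run_accession", "experiment_alias_x"]]
    .rename(columns={"experiment_alias_x": "experiment_alias"})
    .drop_duplicates()
)

# Option B: drop the duplicated column from one side before merging,
# so no suffixes are added in the first place.
result_b = (
    gsm_df.merge(
        srr_df.drop(columns=["experiment_alias"]),
        on="experiment_accession",
    )[["run_accession", "experiment_alias"]]
    .drop_duplicates()
)

print(result_a)
print(result_b)
```

Option A is the smallest change to existing code. Option B avoids depending on which side of the merge ends up with which suffix.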

saketkc commented 4 years ago

Great find! I will look into this over the weekend.