cmap / cmapPy

Assorted tools for interacting with .gct, .gctx files and other Connectivity Map (Broad Institute) data/tools
https://clue.io/cmapPy/index.html
BSD 3-Clause "New" or "Revised" License

Exception("parse_gctx check_id_validity " + msg) #2

Closed kai0511 closed 7 years ago

kai0511 commented 7 years ago

When I extract data from GSE92742_Broad_LINCS_Level5_COMPZ.MODZ_n473647x12328.gctx using cmapPy, an exception is raised. Could you help me with this exception? I don't think it should happen.

I tried to see what the data looks like by following the code in the tutorial cmapPy_pandasGEXpress_tutorial.ipynb.

Here is the code:

vorinostat_only_gctoo = parse("GSE70138_Broad_LINCS_Level5_COMPZ_n118050x12328_2017-03-06.gctx", cid=vorinostat_ids)

The following is the full exception message:

No handlers could be found for logger "cmap_logger"
Traceback (most recent call last):
  File "", line 1, in
  File "/exeh/exe3/zhaok/.local/lib/python2.7/site-packages/cmapPy/pandasGEXpress/parse.py", line 51, in parse
    curr = parse_gctx.parse(file_path, convert_neg_666, rid, cid, ridx, cidx, meta_only, make_multiindex)
  File "/exeh/exe3/zhaok/.local/lib/python2.7/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 65, in parse
    (sorted_ridx, sorted_cidx) = check_and_order_id_inputs(rid, ridx, cid, cidx, row_meta, col_meta)
  File "/exeh/exe3/zhaok/.local/lib/python2.7/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 107, in check_and_order_id_inputs
    col_ids = check_and_convert_ids(col_type, col_ids, col_meta_df)
  File "/exeh/exe3/zhaok/.local/lib/python2.7/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 140, in check_and_convert_ids
    check_id_validity(id_list, meta_df)
  File "/exeh/exe3/zhaok/.local/lib/python2.7/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 153, in check_id_validity
    raise Exception("parse_gctx check_id_validity " + msg)
Exception: parse_gctx check_id_validity some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids: set(['LPROT002_MCF7_6H:P10', 'LJP008_HCC515_24H:A03', 'LPROT002_MCF7_6H:P12', 'LJP009_ASC_24H:A03',

kai0511 commented 7 years ago

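I figured out this problem after looking at the error log carefully. I wasn't using the correct version of the signature data set, so the ids I passed in cid were not present in the column metadata of the file being parsed.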

What I should use is this one.
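
For anyone hitting the same exception, one way to catch this earlier is to compare the ids you plan to pass as cid against the column ids of the file before subsetting. The snippet below is only a minimal sketch of that check in plain Python: find_missing_ids is a hypothetical helper (not part of cmapPy), and the ids shown are made up for illustration. With cmapPy, the available ids would presumably come from the index of the file's column metadata; the exact way to load metadata only depends on the installed version, so that step is left as a comment.

```python
def find_missing_ids(requested_ids, available_ids):
    """Return the requested ids that are absent from available_ids, sorted.

    Hypothetical helper, not part of cmapPy.
    """
    return sorted(set(requested_ids) - set(available_ids))


# Made-up ids for illustration only.
# In practice, `available` would be the column ids of the .gctx file
# (assumption: taken from the index of its column metadata), and
# `requested` would be the list passed as cid, e.g. vorinostat_ids.
requested = ["PLATE_A_MCF7_6H:A01", "PLATE_B_HCC515_24H:A03"]
available = ["PLATE_B_HCC515_24H:A03", "PLATE_C_ASC_24H:B02"]

missing = find_missing_ids(requested, available)
if missing:
    print("ids not present in this file:", missing)
```

If the result is non-empty, the ids and the .gctx file most likely come from different data releases, which is what happened here.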