I am trying to parse the following files:
1. GSE70138_Broad_LINCS_Level3_INF_mlr12k_n345976x12328_2017-03-06.gctx.gz
2. GSE70138_Broad_LINCS_Level3_INF_mlr12k_n78980x22268_2015-06-30.gct.gz
3. GSE70138_Broad_LINCS_Level4_ZSPCINF_mlr12k_n113012x22268_2015-12-31.gct.gz
together with one of these metadata files:
1. GSE70138_Broad_LINCS_sig_info_2017-03-06.txt.gz
2. GSE70138_Broad_LINCS_inst_info_2017-03-06.txt.gz
I want to load only a subset of each file so that the data are small enough to handle.
Each time I do this with GSE70138_Broad_LINCS_Level3_INF_mlr12k_n345976x12328_2017-03-06.gctx.gz, I see an error:
some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids: {'neratinib'}
Traceback (most recent call last):
File "", line 1, in
File "/home/sysmedicine/anaconda3/envs/my_conda/lib/python3.8/site-packages/cmapPy/pandasGEXpress/parse.py", line 65, in parse
out = parse_gctx.parse(file_path, convert_neg_666=convert_neg_666,
File "/home/sysmedicine/anaconda3/envs/my_conda/lib/python3.8/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 107, in parse
(sorted_ridx, sorted_cidx) = check_and_order_id_inputs(rid, ridx, cid, cidx, row_meta, col_meta)
File "/home/sysmedicine/anaconda3/envs/my_conda/lib/python3.8/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 146, in check_and_order_id_inputs
col_ids = check_and_convert_ids(col_type, col_ids, col_meta_df)
File "/home/sysmedicine/anaconda3/envs/my_conda/lib/python3.8/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 179, in check_and_convert_ids
check_id_validity(id_list, meta_df)
File "/home/sysmedicine/anaconda3/envs/my_conda/lib/python3.8/site-packages/cmapPy/pandasGEXpress/parse_gctx.py", line 195, in check_id_validity
raise Exception("parse_gctx check_id_validity " + msg)
Exception: parse_gctx check_id_validity some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids: {'neratinib'}
How can I fix this problem?
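
One possible reading of the error message is that the value given for subsetting, 'neratinib', is a perturbagen name and not one of the column ids stored in the file. The sketch below illustrates that idea using only the Python standard library. It assumes that the metadata table has an id column called inst_id and a name column called pert_iname; both column names, the helper functions, and the DEMO_PLATE ids are hypothetical stand-ins and should be checked against the real inst_info or sig_info file.

```python
import csv
import io


def read_info(handle):
    """Read a tab-separated info table into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def ids_for_perturbagen(rows, pert_name, id_col="inst_id", name_col="pert_iname"):
    """Return the ids of all rows whose perturbagen name equals pert_name.

    The default column names are assumptions about the metadata layout.
    """
    return [row[id_col] for row in rows if row[name_col] == pert_name]


def missing_ids(requested, available):
    """Return the requested ids that are absent from the available ids."""
    return set(requested) - set(available)


# Tiny made-up table standing in for the real metadata file.
demo = io.StringIO(
    "inst_id\tpert_iname\n"
    "DEMO_PLATE:A01\tneratinib\n"
    "DEMO_PLATE:A02\tDMSO\n"
    "DEMO_PLATE:A03\tneratinib\n"
)
rows = read_info(demo)
all_ids = [row["inst_id"] for row in rows]

# Using the drug name as an id leaves it unmatched, as in the error message.
print(missing_ids(["neratinib"], all_ids))

# Looking up the ids for that drug first leaves nothing unmatched.
selected = ids_for_perturbagen(rows, "neratinib")
print(selected)
print(missing_ids(selected, all_ids))

# With the real data (not run here), the same list could be passed on:
#   import gzip
#   from cmapPy.pandasGEXpress.parse import parse
#   with gzip.open("GSE70138_Broad_LINCS_inst_info_2017-03-06.txt.gz", "rt") as fh:
#       cids = ids_for_perturbagen(read_info(fh), "neratinib")
#   data = parse("<decompressed .gctx file>", cid=cids)
```

If the ids taken from the metadata file are still reported as missing, the metadata file probably does not belong to the matrix being parsed (for example sig_info ids used with a Level 3 file), so it is worth comparing a few ids from the metadata against the column ids of the matrix before subsetting.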