Closed grbergeron closed 5 months ago
Hey..
It seems you are running CurveCurator in TMT-Peptide mode based on MaxQuant output. In this mode, CurveCurator expects the evidence.txt file as input. This file should have a column called 'Modified sequence' (case sensitive). However, CurveCurator cannot find this column name in the file you provided, which is why it raises KeyError: 'Modified sequence'.
May I ask what type of data you have? I suspect that you have different data than TMT-Peptide searched with MQ.
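A quick way to confirm this is to look at the header row of the input file before running CurveCurator. The snippet below is only a sketch using the Python standard library: the helper names (`read_header`, `find_column`) are made up for illustration and are not part of CurveCurator, and it assumes a tab-separated file. The in-memory sample stands in for the real evidence.txt.

```python
import csv
import io

# Column name taken from the KeyError above; the lookup is case sensitive.
REQUIRED_COLUMN = "Modified sequence"


def read_header(handle, delimiter="\t"):
    """Return the column names from the first row of a delimited text file."""
    reader = csv.reader(handle, delimiter=delimiter)
    return next(reader, [])


def find_column(header, wanted=REQUIRED_COLUMN):
    """Return (exact_match, near_misses) for a case-sensitive column lookup."""
    exact = wanted in header
    # Near misses differ only in letter case or surrounding whitespace.
    near = [c for c in header
            if c != wanted and c.strip().lower() == wanted.lower()]
    return exact, near


# Stand-in for: open("evidence.txt", newline="", encoding="utf-8")
sample = io.StringIO(
    "Sequence\tModified Sequence\tProteins\n"
    "PEPTIDEK\t_PEPTIDEK_\tP12345\n"
)
header = read_header(sample)
exact, near = find_column(header)
print("exact match:", exact)
print("near misses:", near)
```

If `exact` is False and `near` is not empty, the column exists but is spelled differently. If both are empty, the file is probably not a MaxQuant evidence.txt at all.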
Best Flo
Hi Flo, Thank you for your prompt response!!
I was trying to test the system by using the data provided in the paper. That's when I came across this error.
I am working with Spectronaut DIA data and was hoping to figure out a way to make it compatible with CurveCurator. Any suggestions on how I can accomplish this?
Which dataset and TOML file combination exactly did you try to reproduce? The example files on GitHub should all work. I will try to reproduce your error from the example files.
Currently, CurveCurator does not have a built-in Spectronaut parser. We only have a DIANN parser at the moment. But I am generally interested in providing one in a future release. I will try to find a few DIA files and push them through Spectronaut to see the file format that it outputs for protein and peptide data.
In the meantime, you can always use the generic data upload. For this you need to specify in the TOML file:
measurement_type= 'OTHER'
data_type = 'OTHER'
search_engine = 'OTHER'
Your input_data.txt file must then have the following structure, where xxxx is your MS intensity and 1 to N are the experiment names you provide in the TOML file.

| Name | Raw 1 | ... | Raw N |
|---|---|---|---|
| Protein_1 | xxxx | ... | xxxx |
| Protein_2 | xxxx | ... | xxxx |
| Protein_3 | xxxx | ... | xxxx |
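For data coming from another tool, a file in this layout can be written with a few lines of Python. This is a minimal sketch, not a CurveCurator function: `write_generic_table` is a hypothetical helper, the intensities are invented placeholder values, and the tab separator is an assumption based on the .txt extension. How to pull the per-experiment intensities out of a Spectronaut report is left open, since that depends on the export format.

```python
import csv
import io


def write_generic_table(rows, experiments, out_handle):
    """Write rows as Name, Raw 1 .. Raw N (tab separated, an assumption).

    rows: iterable of (name, [one intensity per experiment]) pairs.
    experiments: experiment names as listed in the TOML file.
    """
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Name"] + [f"Raw {e}" for e in experiments])
    for name, intensities in rows:
        # Every row needs exactly one value per experiment column.
        if len(intensities) != len(experiments):
            raise ValueError(
                f"{name}: expected {len(experiments)} values, "
                f"got {len(intensities)}"
            )
        writer.writerow([name] + list(intensities))


# Placeholder values for illustration only.
experiments = [1, 2, 3]
rows = [
    ("Protein_1", [1200.5, 980.0, 410.2]),
    ("Protein_2", [88.0, 91.5, 87.3]),
]

buf = io.StringIO()  # stand-in for open("input_data.txt", "w", newline="")
write_generic_table(rows, experiments, buf)
print(buf.getvalue())
```

The length check is there because a row with a missing experiment would otherwise shift values into the wrong Raw column without any warning.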
I recommend starting with the minimal parameter toml file and then slowly adding filters / other options until you find a setup that fits your data.
Best Flo
May I ask if it would be okay to chat via email?
sure..
Hi, I am unsure how to interpret this error or what to do to fix it.
Uncaught exception
Traceback (most recent call last):
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\pandas\core\indexes\base.py", line 3805, in get_loc
return self._engine.get_loc(casted_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Modified sequence'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users.conda\envs\CurveCuratorEnv\Scripts\CurveCurator.exe\__main__.py", line 7, in <module>
sys.exit(main())
^^^^^^
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\curve_curator\__main__.py", line 99, in main
data = data_parser.load(config)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\curve_curator\data_parser.py", line 427, in load
df = load_mq_tmt_peptides(path, search_engine_version, unique_cols=unique_cols, sum_cols=raw_cols, first_cols=first_cols, max_cols=max_cols)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\curve_curator\data_parser.py", line 156, in load_mq_tmt_peptides
df['Modified sequence'] = clean_modified_sequence(df['Modified sequence'])
~~^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\pandas\core\frame.py", line 4090, in __getitem__
indexer = self.columns.get_loc(key)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.conda\envs\CurveCuratorEnv\Lib\site-packages\pandas\core\indexes\base.py", line 3812, in get_loc
raise KeyError(key) from err
KeyError: 'Modified sequence'