Describe the bug
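The most recent version of directLFQ fails with IndexError: list index out of range when running the alphaDIA test case test_output_transform(). The error is raised in get_ion_intensity_dataframe_from_list_of_shifted_peptides() (directlfq/protein_intensity_estimation.py:206), which is called from estimate_protein_intensities().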
To Reproduce
Steps to reproduce the behavior:
Run the test case test_output_transform() in alphadia/tests/unit_tests/test_outputtransform.py (pytest node ID: tests/unit_tests/test_outputtransform.py::test_output_transform).
Expected behavior
test_output_transform() passes, i.e. output.build_lfq_tables(raw_folders, save=True) completes without raising an IndexError.
Logs
================================================================================================= test session starts =================================================================================================
platform darwin -- Python 3.9.18, pytest-7.4.3, pluggy-1.3.0
rootdir: /Users/georgwallmann/Documents/git/alphadia
collected 59 items / 5 deselected / 54 selected
tests/unit_tests/test_calibration.py .... [ 7%]
tests/unit_tests/test_data.py .. [ 11%]
tests/unit_tests/test_fdr.py ..... [ 20%]
tests/unit_tests/test_fragcomp.py ... [ 25%]
tests/unit_tests/test_grouping.py ......... [ 42%]
tests/unit_tests/test_libtransform.py . [ 44%]
tests/unit_tests/test_numba.py .... [ 51%]
tests/unit_tests/test_outputtransform.py F [ 53%]
tests/unit_tests/test_plexscoring.py . [ 55%]
tests/unit_tests/test_plotting.py .. [ 59%]
tests/unit_tests/test_quadrupole.py ... [ 64%]
tests/unit_tests/test_reporting.py ...... [ 75%]
tests/unit_tests/test_utils.py .... [ 83%]
tests/unit_tests/test_workflow.py ......... [100%]
====================================================================================================== FAILURES =======================================================================================================
________________________________________________________________________________________________ test_output_transform ________________________________________________________________________________________________
def test_output_transform():
run_columns = ["run_0", "run_1", "run_2"]
config = {
"general": {
"thread_count": 8,
},
"fdr": {
"fdr": 0.01,
"inference_strategy": "heuristic",
"group_level": "proteins",
"keep_decoys": False,
},
"search_output": {
"min_k_fragments": 3,
"min_correlation": 0.25,
"num_samples_quadratic": 50,
"min_nonnan": 1,
"normalize_lfq": True,
"peptide_level_lfq": False,
"precursor_level_lfq": False,
},
}
temp_folder = os.path.join(tempfile.gettempdir(), "alphadia")
os.makedirs(temp_folder, exist_ok=True)
progress_folder = os.path.join(temp_folder, "progress")
os.makedirs(progress_folder, exist_ok=True)
# setup raw folders
raw_folders = [os.path.join(progress_folder, run) for run in run_columns]
psm_base_df = _mock_precursor_df(n_precursor=100)
fragment_base_df = _mock_fragment_df(n_precursor=200)
for raw_folder in raw_folders:
os.makedirs(raw_folder, exist_ok=True)
psm_df = psm_base_df.sample(50)
psm_df["run"] = os.path.basename(raw_folder)
frag_df = fragment_base_df[
fragment_base_df["precursor_idx"].isin(psm_df["precursor_idx"])
]
frag_df.to_csv(os.path.join(raw_folder, "frag.tsv"), sep="\t", index=False)
psm_df.to_csv(os.path.join(raw_folder, "psm.tsv"), sep="\t", index=False)
output = outputtransform.SearchPlanOutput(config, temp_folder)
_ = output.build_precursor_table(raw_folders, save=True)
_ = output.build_stat_df(raw_folders, save=True)
> _ = output.build_lfq_tables(raw_folders, save=True)
tests/unit_tests/test_outputtransform.py:169:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
alphadia/outputtransform.py:645: in build_lfq_tables
lfq_df = qb.lfq(
alphadia/outputtransform.py:276: in lfq
protein_df, _ = lfqprot_estimation.estimate_protein_intensities(
../../../miniconda3/envs/alpha/lib/python3.9/site-packages/directlfq/protein_intensity_estimation.py:37: in estimate_protein_intensities
ion_df = get_ion_intensity_dataframe_from_list_of_shifted_peptides(list_of_tuple_w_protein_profiles_and_shifted_peptides, allprots)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
list_of_tuple_w_protein_profiles_and_shifted_peptides = [(array([9.24812545, 9.24812545, nan]), 0 1 2
pg ion ...6417 1.6417 1.6417
695990860382217 1.6417 1.6417 1.6417
695995155349513 1.6417 1.6417 1.6417), ...]
allprots = ['EPROT', 'VPROT', 'ZPROT', 'LPROT', 'FPROT', 'SPROT', ...]
def get_ion_intensity_dataframe_from_list_of_shifted_peptides(list_of_tuple_w_protein_profiles_and_shifted_peptides, allprots):
ion_names = []
ion_vals = []
protein_names = []
column_names = list_of_tuple_w_protein_profiles_and_shifted_peptides[0][1].columns.tolist()
for idx in range(len(list_of_tuple_w_protein_profiles_and_shifted_peptides)):
> protein_name = allprots[idx]
E IndexError: list index out of range
../../../miniconda3/envs/alpha/lib/python3.9/site-packages/directlfq/protein_intensity_estimation.py:206: IndexError
------------------------------------------------------------------------------------------------ Captured stdout call -------------------------------------------------------------------------------------------------
2024-01-24 12:08:13> Performing protein grouping and FDR
2024-01-24 12:08:13> Building output for run_0
2024-01-24 12:08:13> Building output for run_1
2024-01-24 12:08:13> Building output for run_2
2024-01-24 12:08:13> Building combined output
2024-01-24 12:08:13> Performing protein inference
2024-01-24 12:08:13> Inference strategy: heuristic. Using maximum parsimony with grouping for protein inference
2024-01-24 12:08:13> Performing protein FDR
2024-01-24 12:08:13> Test AUC: 1.000
2024-01-24 12:08:13> Train AUC: 1.000
2024-01-24 12:08:13> AUC difference: 0.00%
2024-01-24 12:08:13> ================ Protein FDR =================
2024-01-24 12:08:13> Unique protein groups in output
2024-01-24 12:08:13> 1% protein FDR: 24
2024-01-24 12:08:13>
2024-01-24 12:08:13> Unique precursor in output
2024-01-24 12:08:13> 1% protein FDR: 42
2024-01-24 12:08:13> ================================================
2024-01-24 12:08:13> Writing precursor output to disk
2024-01-24 12:08:13> Building search statistics
2024-01-24 12:08:13> Reading precursors.tsv file
2024-01-24 12:08:13> Writing stat output to disk
2024-01-24 12:08:13> Performing label free quantification
2024-01-24 12:08:13> Reading precursors.tsv file
2024-01-24 12:08:13> Accumulating fragment data
2024-01-24 12:08:13> reading frag file for run_0
2024-01-24 12:08:13> reading frag file for run_1
2024-01-24 12:08:13> reading frag file for run_2
2024-01-24 12:08:13> Performing label free quantification on the pg level
2024-01-24 12:08:13> Filtering fragments by quality
2024-01-24 12:08:13> Performing label-free quantification using directLFQ
2024-01-24 12:08:13> to few values for normalization without missing values. Including missing values
2024-01-24 12:08:13> 24 lfq-groups total
2024-01-24 12:08:13> using 8 processes
2024-01-24 12:08:13> lfq-object 0
-------------------------------------------------------------------------------------------------- Captured log call --------------------------------------------------------------------------------------------------
PROGRESS root:outputtransform.py:419 Performing protein grouping and FDR
INFO root:outputtransform.py:427 Building output for run_0
INFO root:outputtransform.py:427 Building output for run_1
INFO root:outputtransform.py:427 Building output for run_2
INFO root:outputtransform.py:446 Building combined output
INFO root:outputtransform.py:456 Performing protein inference
INFO root:outputtransform.py:488 Inference strategy: heuristic. Using maximum parsimony with grouping for protein inference
INFO root:outputtransform.py:501 Performing protein FDR
INFO root:fdr.py:355 Test AUC: 1.000
INFO root:fdr.py:356 Train AUC: 1.000
INFO root:fdr.py:359 AUC difference: 0.00%
PROGRESS root:outputtransform.py:508 ================ Protein FDR =================
PROGRESS root:outputtransform.py:511 Unique protein groups in output
PROGRESS root:outputtransform.py:512 1% protein FDR: 24
PROGRESS root:outputtransform.py:513
PROGRESS root:outputtransform.py:514 Unique precursor in output
PROGRESS root:outputtransform.py:515 1% protein FDR: 42
PROGRESS root:outputtransform.py:516 ================================================
INFO root:outputtransform.py:524 Writing precursor output to disk
PROGRESS root:outputtransform.py:560 Building search statistics
INFO root:outputtransform.py:390 Reading precursors.tsv file
INFO root:outputtransform.py:576 Writing stat output to disk
PROGRESS root:outputtransform.py:607 Performing label free quantification
INFO root:outputtransform.py:390 Reading precursors.tsv file
INFO root:outputtransform.py:123 Accumulating fragment data
INFO root:outputtransform.py:58 reading frag file for run_0
INFO root:outputtransform.py:58 reading frag file for run_1
INFO root:outputtransform.py:58 reading frag file for run_2
PROGRESS root:outputtransform.py:633 Performing label free quantification on the pg level
INFO root:outputtransform.py:208 Filtering fragments by quality
INFO root:outputtransform.py:255 Performing label-free quantification using directLFQ
INFO directlfq.normalization:normalization.py:239 to few values for normalization without missing values. Including missing values
INFO directlfq.protein_intensity_estimation:protein_intensity_estimation.py:32 24 lfq-groups total
INFO directlfq.protein_intensity_estimation:protein_intensity_estimation.py:107 using 8 processes
================================================================================================== warnings summary ===================================================================================================
tests/unit_tests/test_fragcomp.py::test_fragment_competition
/Users/georgwallmann/Documents/git/alphadia/alphadia/fragcomp.py:189: FutureWarning: The provided callable <built-in function min> is currently using SeriesGroupBy.min. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "min" instead.
index_df = frag_df.groupby("_candidate_idx", as_index=False).agg(
tests/unit_tests/test_fragcomp.py::test_fragment_competition
/Users/georgwallmann/Documents/git/alphadia/alphadia/fragcomp.py:189: FutureWarning: The provided callable <built-in function max> is currently using SeriesGroupBy.max. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "max" instead.
index_df = frag_df.groupby("_candidate_idx", as_index=False).agg(
tests/unit_tests/test_fragcomp.py::test_fragment_competition
/Users/georgwallmann/Documents/git/alphadia/alphadia/fragcomp.py:247: FutureWarning: The provided callable <built-in function min> is currently using SeriesGroupBy.min. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "min" instead.
index_df = psm_df.groupby("window_idx", as_index=False).agg(
tests/unit_tests/test_fragcomp.py::test_fragment_competition
/Users/georgwallmann/Documents/git/alphadia/alphadia/fragcomp.py:247: FutureWarning: The provided callable <built-in function max> is currently using SeriesGroupBy.max. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "max" instead.
index_df = psm_df.groupby("window_idx", as_index=False).agg(
tests/unit_tests/test_outputtransform.py::test_output_transform
/Users/georgwallmann/Documents/git/alphadia/alphadia/outputtransform.py:458: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
psm_df["mods"].fillna("", inplace=True)
tests/unit_tests/test_outputtransform.py::test_output_transform
/Users/georgwallmann/Documents/git/alphadia/alphadia/outputtransform.py:461: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
psm_df["mod_sites"].fillna("", inplace=True)
tests/unit_tests/test_outputtransform.py::test_output_transform
/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.
warnings.warn(
tests/unit_tests/test_outputtransform.py::test_output_transform
/Users/georgwallmann/Documents/git/alphadia/alphadia/fdr.py:403: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
plt.show()
tests/unit_tests/test_plotting.py::test_plot_cycle
/Users/georgwallmann/Documents/git/alphadia/alphadia/plotting/cycle.py:189: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed two minor releases later. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap(obj)`` instead.
cmap = cm.get_cmap(cmap_name)
tests/unit_tests/test_plotting.py::test_plot_cycle
/Users/georgwallmann/Documents/git/alphadia/alphadia/plotting/cycle.py:46: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed two minor releases later. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap(obj)`` instead.
cmap = cm.get_cmap(cmap_name)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================================================== short test summary info ===============================================================================================
FAILED tests/unit_tests/test_outputtransform.py::test_output_transform - IndexError: list index out of range
============================================================================== 1 failed, 53 passed, 5 deselected, 10 warnings in 34.87s ===============================================================================
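The traceback shows the loop running over range(len(list_of_tuple_w_protein_profiles_and_shifted_peptides)) while indexing allprots[idx], so the error can only occur when the list of results is longer than allprots. The following is a minimal sketch of that indexing pattern, not the directLFQ code itself: collect_names, check_lengths, and the sample data are hypothetical names made up for illustration, and the sketch does not explain why the two lengths differ in the failing test.

```python
def collect_names(results, names):
    """Mimic the pattern from the traceback: one name per result, looked up by position."""
    collected = []
    for idx in range(len(results)):
        # Raises IndexError as soon as idx >= len(names)
        collected.append(names[idx])
    return collected


def check_lengths(results, names):
    """Hypothetical guard (not part of directLFQ): report a length mismatch before indexing."""
    if len(results) != len(names):
        raise ValueError(f"{len(results)} results but {len(names)} names")


# Placeholder data, not taken from the failing test
results = ["profile_a", "profile_b", "profile_c"]
names = ["EPROT", "VPROT"]  # one entry short on purpose

try:
    collect_names(results, names)
except IndexError as err:
    error_message = str(err)  # same message as in the pytest output
```

A length check like check_lengths, placed in front of the loop, would turn the bare IndexError into a message that states both lengths, which may help narrow down where the mismatch is introduced.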