MannLabs / directlfq

Fast and accurate label-free quantification for small and very large numbers of proteomes
https://doi.org/10.1101/2023.02.17.528962
Apache License 2.0

Issue warning if quant_id is not unique key #35

Open GeorgWa opened 5 months ago

GeorgWa commented 5 months ago

Describe the bug

If we pass a dataframe with duplicates in the quant_id column to lfqnorm.NormalizationManagerSamplesOnSelectedProteins(), it fails with an obscure numba TypingError (see the logs below) that gives no hint about the actual cause.

A more informative error message, or an upfront check that the column is unique, would be useful.
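For illustration, such a check could look roughly like the sketch below. This is not directlfq's code: the helper name `check_quant_id_unique`, the example values, and the assumption that quant_id is an ordinary column (rather than an index level) are all hypothetical.

```python
import pandas as pd


def check_quant_id_unique(df: pd.DataFrame, column: str = "quant_id") -> None:
    """Hypothetical helper: raise a readable error if `column` has duplicates."""
    # keep=False marks every occurrence of a repeated value, not just the later ones
    repeated = df.loc[df[column].duplicated(keep=False), column]
    if not repeated.empty:
        examples = repeated.unique()[:5].tolist()
        raise ValueError(
            f"Column '{column}' must be a unique key, but "
            f"{repeated.nunique()} value(s) occur more than once, e.g. {examples}"
        )


# Made-up example data: a table with unique ids passes silently ...
ok_df = pd.DataFrame({"quant_id": ["pepA", "pepB"], "sample1": [10.0, 11.0]})
check_quant_id_unique(ok_df)

# ... while a table with a repeated id fails early with a clear message.
bad_df = pd.DataFrame({"quant_id": ["pepA", "pepA"], "sample1": [10.0, 11.0]})
try:
    check_quant_id_unique(bad_df)
    message = ""
except ValueError as err:
    message = str(err)
```

Running a check like this before normalization would surface the problem in the caller's process instead of inside a multiprocessing worker.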

Logs

../alphadia/outputtransform.py:705: in build_lfq_tables
    lfq_df = qb.lfq(
../alphadia/outputtransform.py:284: in lfq
    protein_df, _ = lfqprot_estimation.estimate_protein_intensities(
/usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/directlfq/protein_intensity_estimation.py:44: in estimate_protein_intensities
    list_of_tuple_w_protein_profiles_and_shifted_peptides = get_list_of_tuple_w_protein_profiles_and_shifted_peptides(normed_df, num_samples_quadratic, min_nonan, num_cores)
/usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/directlfq/protein_intensity_estimation.py:60: in get_list_of_tuple_w_protein_profiles_and_shifted_peptides
    list_of_tuple_w_protein_profiles_and_shifted_peptides = get_list_with_multiprocessing(input_specification_tuplelist_idx__df__num_samples_quadratic__min_nonan, num_cores)
/usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/directlfq/protein_intensity_estimation.py:107: in get_list_with_multiprocessing
    list_of_tuple_w_protein_profiles_and_shifted_peptides = pool.starmap(calculate_peptide_and_protein_intensities, input_specification_tuplelist_idx__df__num_samples_quadratic__min_nonan)
/usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/multiprocess/pool.py:372: in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <multiprocess.pool.MapResult object at 0x7fe2f7857a90>, timeout = None

    def get(self, timeout=None):
        self.wait(timeout)
        if not self.ready():
            raise TimeoutError
        if self._success:
            return self._value
        else:
>           raise self._value
E           numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E           No implementation of function Function(<built-in function iadd>) found for signature:
E            
E            >>> iadd(Literal[int](0), array(bool, 1d, A))
E            
E           There are 18 candidate implementations:
E             - Of which 16 did not match due to:
E             Overload of function 'iadd': File: <numerous>: Line N/A.
E               With argument(s): '(int64, array(bool, 1d, A))':
E              No match.
E             - Of which 2 did not match due to:
E             Operator Overload in function 'iadd': File: unknown: Line unknown.
E               With argument(s): '(int64, array(bool, 1d, A))':
E              No match for registered cases:
E               * (int64, int64) -> int64
E               * (int64, uint64) -> int64
E               * (uint64, int64) -> int64
E               * (uint64, uint64) -> uint64
E               * (float32, float32) -> float32
E               * (float64, float64) -> float64
E               * (complex64, complex64) -> complex64
E               * (complex128, complex128) -> complex128
E           
E           During: typing of intrinsic-call at /usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/directlfq/normalization.py (304)
E           
E           File "../../../../../../usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/directlfq/normalization.py", line 304:
E               def _get_num_nas_in_row(row):
E                   <source elided>
E                   for is_nan in isnans:
E                       sum+=is_nan
E                       ^

/usr/local/miniconda/envs/alphadia/lib/python3.9/site-packages/multiprocess/pool.py:771: TypingError
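
A possible reading of the traceback, not confirmed against the directlfq source: the typing failure `iadd(int64, array(bool, 1d, A))` means that each `is_nan` in the loop is a 1-D boolean array rather than a single boolean, which is what happens when the "row" passed to `_get_num_nas_in_row` is actually 2-D. One way a duplicated key can produce that in pandas is label-based selection, sketched below with made-up data (whether directlfq selects rows this way is an assumption):

```python
import numpy as np
import pandas as pd

# Made-up intensity tables: one with a unique index, one with a duplicated label.
unique_df = pd.DataFrame(
    {"s1": [1.0, np.nan], "s2": [2.0, 3.0]}, index=["pepA", "pepB"]
)
dup_df = pd.DataFrame(
    {"s1": [1.0, np.nan], "s2": [2.0, 3.0]}, index=["pepA", "pepA"]
)

# With a unique label, .loc returns a Series, so the values are 1-D.
row_unique = unique_df.loc["pepA"].to_numpy()
# With a duplicated label, .loc returns a DataFrame, so the values are 2-D.
row_dup = dup_df.loc["pepA"].to_numpy()

# Iterating a 1-D boolean array yields scalars; iterating a 2-D one yields
# 1-D boolean arrays, which a compiled `total += is_nan` cannot add to an int.
first_unique = next(iter(np.isnan(row_unique)))
first_dup = next(iter(np.isnan(row_dup)))
```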
ammarcsj commented 5 months ago

Thanks! I added a check and an error message. The change is currently on the develop branch and will be included in the next release.