TalusBio / diadem

Apache License 2.0

[nitpick] mokapot output adds an index column #18

Open jspaezp opened 1 year ago

jspaezp commented 1 year ago

When the data is exported from mokapot, the output gains an index column (`__index_level_0__`, the last column below). This should be removed.

┌─────────────────────────────────────┬───────────┬─────────────────────────────────────┬─────────────────────────────────────┬─────┬─────────────────┬─────────────┬──────────┬───────────────────┐
│ peptide                             ┆ is_target ┆ peak_id                             ┆ filename                            ┆ ... ┆ mokapot q-value ┆ mokapot PEP ┆ proteins ┆ __index_level_0__ │
│ ---                                 ┆ ---       ┆ ---                                 ┆ ---                                 ┆     ┆ ---             ┆ ---         ┆ ---      ┆ ---               │
│ str                                 ┆ bool      ┆ str                                 ┆ str                                 ┆     ┆ f64             ┆ f64         ┆ str      ┆ i64               │
╞═════════════════════════════════════╪═══════════╪═════════════════════════════════════╪═════════════════════════════════════╪═════╪═════════════════╪═════════════╪══════════╪═══════════════════╡
│ DATNVGDEGGFAPNILENK/2               ┆ true      ┆ 950.000000, 1000.000000::1335::1    ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.000051        ┆ 5.8025e-14  ┆ P06733   ┆ 0                 │
│ SSSNLAVSGHPFYQVSATR/2               ┆ true      ┆ 1000.000000, 1050.000000::1373::... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.000051        ┆ 9.0020e-14  ┆ P51587   ┆ 1                 │
│ VTAEDLHLEKETAFQR/3                  ┆ true      ┆ 600.000000, 650.000000::843::531... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.000051        ┆ 9.4989e-14  ┆ A6NNC1   ┆ 2                 │
│ LHNHGTVDWNSKRR/3                    ┆ true      ┆ 550.000000, 600.000000::752::614... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.000051        ┆ 9.5774e-14  ┆ A8MVM7   ┆ 3                 │
│ ...                                 ┆ ...       ┆ ...                                 ┆ ...                                 ┆ ... ┆ ...             ┆ ...         ┆ ...      ┆ ...               │
│ GLAPDLPEDLYHLIK/3                   ┆ true      ┆ 550.000000, 600.000000::1862::21... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.002868        ┆ 1.0         ┆ P62277   ┆ 19577             │
│ <[UNIMOD:4]@C>AGQVDAHDCEALGWGSEA... ┆ true      ┆ 850.000000, 900.000000::1339::39... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.002868        ┆ 1.0         ┆          ┆ 19578             │
│ HRLASFK/2                           ┆ true      ┆ 400.000000, 450.000000::1130::14... ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.112522        ┆ 1.0         ┆ Q0P6D6   ┆ 21720             │
│ KASLLSAK/2                          ┆ true      ┆ 400.000000, 450.000000::973::882    ┆ Hela_25ng_22min_6x3_short_S2-A4_... ┆ ... ┆ 0.651951        ┆ 1.0         ┆ O94885   ┆ 32254             │
└─────────────────────────────────────┴───────────┴─────────────────────────────────────┴─────────────────────────────────────┴─────┴─────────────────┴─────────────┴──────────┴───────────────────┘
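A column named `__index_level_0__` is what pyarrow writes when a pandas DataFrame with an unnamed, non-default index is saved to Parquet. The gaps in the values above (19578, 21720, 32254) suggest the frame was filtered before export. The following is a minimal sketch, assuming the results pass through a pandas DataFrame before being written; `strip_index` and the toy frame are hypothetical and are not part of diadem or mokapot.

```python
import pandas as pd

# Name pyarrow gives to an unnamed pandas index stored in Parquet.
INDEX_COL = "__index_level_0__"


def strip_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a fresh RangeIndex and no leftover index column."""
    out = df.reset_index(drop=True)
    # errors="ignore" makes this a no-op when the column is absent.
    return out.drop(columns=[INDEX_COL], errors="ignore")


# Toy frame whose index has gaps, as happens after filtering rows.
psms = pd.DataFrame(
    {
        "peptide": ["HRLASFK/2", "KASLLSAK/2"],
        "mokapot q-value": [0.112522, 0.651951],
    },
    index=[21720, 32254],
)
clean = strip_index(psms)
```

When writing, passing `index=False` to `DataFrame.to_parquet` also keeps the index out of the file, so either step alone should be enough.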
jspaezp commented 1 year ago

Related: I think we should keep the decoys as well, maybe by writing them to a separate table.
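One way to do that is to split on the `is_target` column shown in the table above. This is only a sketch, assuming the combined results are available as a pandas DataFrame; `split_targets_decoys` and the toy rows (including the decoy peptide) are made up for illustration.

```python
import pandas as pd


def split_targets_decoys(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Split a results table into (targets, decoys) using `is_target`."""
    mask = df["is_target"].astype(bool)
    # Reset the index so neither table carries gaps into the output file.
    targets = df[mask].reset_index(drop=True)
    decoys = df[~mask].reset_index(drop=True)
    return targets, decoys


# Toy input: two targets and one made-up decoy.
psms = pd.DataFrame(
    {
        "peptide": ["HRLASFK/2", "KFSALRH/2", "KASLLSAK/2"],
        "is_target": [True, False, True],
    }
)
targets, decoys = split_targets_decoys(psms)
```

Each table can then be written to its own file, which keeps the decoys available for later FDR checks without mixing them into the target output.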

jspaezp commented 1 year ago

Related: the columns are ['peptide', 'is_target', 'peak_id', 'filename', 'target_pair', 'mokapot score', 'mokapot q-value', 'mokapot PEP', 'proteins', '__index_level_0__']. We should also pass the RT (retention time).
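If the RT is not carried through mokapot itself, it could be joined back afterwards on `peak_id`. The sketch below assumes a separate per-peak table with an `rt` column exists. That table, the `rt` column name, the `attach_rt` helper, and the RT values are all hypothetical; only `peak_id` comes from the column list above.

```python
import pandas as pd


def attach_rt(
    psms: pd.DataFrame, features: pd.DataFrame, rt_col: str = "rt"
) -> pd.DataFrame:
    """Left-join a retention-time column onto the results by `peak_id`."""
    rt = features[["peak_id", rt_col]].drop_duplicates("peak_id")
    # validate= raises if a peak_id maps to more than one RT row.
    return psms.merge(rt, on="peak_id", how="left", validate="many_to_one")


psms = pd.DataFrame(
    {
        "peptide": ["DATNVGDEGGFAPNILENK/2", "KASLLSAK/2"],
        "peak_id": [
            "950.000000, 1000.000000::1335::1",
            "400.000000, 450.000000::973::882",
        ],
    }
)
# Toy RT values, invented for this example.
features = pd.DataFrame(
    {
        "peak_id": [
            "400.000000, 450.000000::973::882",
            "950.000000, 1000.000000::1335::1",
        ],
        "rt": [7.25, 12.5],
    }
)
with_rt = attach_rt(psms, features)
```

A left join keeps every PSM row even when no RT is found, so missing values show up as nulls instead of silently dropping results.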