levitsky / pyteomics

Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
http://pyteomics.readthedocs.io
Apache License 2.0

Polarity missing from mzMLb parser #113

Closed sorenwacker closed 1 year ago

sorenwacker commented 1 year ago

Hi, the polarity is missing from the mzMLb parser output. I am using the following code:

from pyteomics import mzmlb

# fn is the path to the mzMLb file
with mzmlb.MzMLb(fn) as ms_data:
    # read every spectrum into a list of dicts
    data = list(ms_data)

print(data[0])

and the output looks like this:

{'index': 0, 'defaultArrayLength': 393, 'id': 'controllerType=0 controllerNumber=1 scan=1', 'scanList': {'count': 1, 'scan': [{'scanWindowList': {'count': 1, 'scanWindow': [{'scan window lower limit': 85.0, 'scan window upper limit': 1275.0}]}, 'filter string': 'FTMS + p ESI Full ms [85.0000-1275.0000]', 'preset scan configuration': 1.0, 'ion injection time': 50.000000745058, 'scan start time': 0.0033572507}], 'no combination': ''}, 'ms level': 1, 'base peak m/z': 453.3438439, 'base peak intensity': 4834081.0, 'total ion current': 26539958.0, 'lowest observed m/z': 91.964447021484, 'highest observed m/z': 1049.737182617188, 'positive scan': '', 'centroid spectrum': '', 'MS1 spectrum': '', 'count': 2, 'm/z array': array([  91.96444702,   98.48381042,  100.11230469,  101.09159851,
        102.09130859,  102.12753296,  103.82502747,  103.95560455,
        104.04561615,  104.0708847 ,  105.95396423,  110.06028748,
        111.04418945,  112.05068207,  112.06158447,  114.09138489,
        116.07071686,  116.1071167 ,  118.08629608,  121.96633148,
        123.05532837,  123.96440887,  128.07070923,  128.14253235,
        128.94871521,  128.95352173,  130.08644104,  130.12272644,
        130.15908813,  130.96650696,  131.11798096,  132.10197449,
        132.96498108,  136.0213623 ,  136.06253052,  137.07151794,
        139.12322998,  139.98799133,  142.08662415,  142.1227417 ,
        143.08183289,  143.15457153,  144.06536865,  144.13844299,
        144.92749023,  144.98225403,  145.09689331,  145.14179993,
        145.98565674,  146.98046875,  147.11265564,  147.98370361,
        148.03938293,  149.01365662,  149.02337646,  149.02944946,
        150.02680969,  151.03256226,  151.09658813,  152.10707092,
        153.12733459,  154.08651733,  154.0975647 ,  156.10186768,
        156.11309814,  156.13833618,  158.08183289,  158.15428162,
        158.16511536,  159.14942932,  159.15711975,  160.09724426,
        161.09660339,  163.07554626,  163.13302612,  164.10694885,
        164.9705658 ,  166.05004883,  166.97509766,  167.03382874,
        167.10697937,  168.94624329,  169.1224823 ,  170.11767578,
        170.15449524,  170.55682373,  171.11299133,  171.14950562,
        172.09716797,  172.11613464,  172.13337708,  172.1697998 ,
        173.12858582,  173.17320251,  174.11265564,  177.05467224,
        177.09104919,  177.12739563,  178.12290955,  178.13096619,
        178.98640442,  179.10673523,  180.98167419,  181.94981384,
        181.99659729,  182.09272766,  182.16542053,  183.10162354,
        184.18180847,  184.88739014,  184.9201355 ,  185.60681152,
        186.12341309,  186.18513489,  186.95266724,  187.14411926,
        188.03184509,  188.12826538,  188.94807434,  190.05026245,
        190.10751343,  192.08624268,  192.13848877,  193.00189209,
        193.09515381,  193.14227295,  194.08460999,  194.11772156,
        194.15429688,  194.99763489,  195.06538391,  195.10179138,
        195.12260437,  196.10516357,  197.15379333,  198.12809753,
        199.16931152,  200.12892151,  200.20095825,  200.23744202,
        200.96839905,  201.2040863 ,  202.21656799,  202.92294312,
        202.96383667,  203.12820435,  205.02560425,  205.08644104,
        207.01789856,  208.1700592 ,  208.93479919,  209.01283264,
        209.15512085,  209.16474915,  210.01405334,  210.02859497,
        210.11413574,  210.12390137,  210.9304657 ,  211.09873962,
        211.1073761 ,  211.13293457,  211.14459229,  211.16984558,
        212.02398682,  212.03314209,  212.09527588,  212.16485596,
        213.09889221,  213.15975952,  214.09115601,  214.10006714,
        214.17988586,  214.2167511 ,  214.98423767,  215.07647705,
        215.12937927,  215.17460632,  216.15966797,  216.17086792,
        217.10783386,  217.18032837,  218.13806152,  219.1734314 ,
        222.1126709 ,  222.9499054 ,  223.09680176,  224.12864685,
        224.1645813 ,  224.88059998,  225.19607544,  226.18026733,
        227.08561707,  227.17544556,  227.27041626,  227.67697144,
        228.12298584,  228.17906189,  229.18156433,  230.13999939,
        230.17407227,  230.24787903,  230.91630554,  231.1235199 ,
        231.18617249,  232.01115417,  232.19017029,  232.9115448 ,
        233.91497803,  234.0771637 ,  234.18554688,  234.91833496,
        235.08065796,  235.91320801,  236.07264709,  236.16438293,
        237.91127014,  239.12753296,  239.1391449 ,  240.14254761,
        241.1545105 ,  241.21580505,  242.15750122,  242.17689514,
        243.13356018,  243.17086792,  243.20675659,  244.15545654,
        244.26393127,  245.20173645,  246.13409424,  246.17027283,
        246.20443726,  246.89016724,  247.17897034,  248.16168213,
        249.18493652,  249.20518494,  250.05107117,  250.11871338,
        250.18849182,  250.89222717,  251.12263489,  251.16409302,
        252.92826843,  254.24761963,  255.15969849,  255.23117065,
        256.05953979,  256.16558838,  257.15176392,  261.12072754,
        262.12918091,  262.14328003,  265.17984009,  268.15380859,
        268.19049072,  268.89202881,  269.24716187,  270.90921021,
        271.22711182,  272.2336731 ,  272.25866699,  273.2359314 ,
        274.91079712,  274.93572998,  276.19604492,  277.20053101,
        278.2479248 ,  279.09378052,  279.15905762,  279.23181152,
        280.16366577,  281.24728394,  283.26345825,  284.19699097,
        284.22137451,  285.20071411,  287.22210693,  288.25308228,
        291.86038208,  294.20684814,  296.05145264,  296.91769409,
        299.25753784,  300.20379639,  300.26461792,  300.29089355,
        301.26837158,  304.17575073,  304.24862671,  304.30020142,
        309.22561646,  310.19934082,  310.24243164,  312.21826172,
        312.25314331,  313.17596436,  313.28482056,  314.26919556,
        314.2862854 ,  315.30081177,  316.28485107,  318.90078735,
        322.84710693,  324.21673584,  329.00592041,  332.18688965,
        332.27978516,  336.21099854,  336.87969971,  339.26464844,
        340.04244995,  340.25946045,  340.76104736,  341.26266479,
        341.76403809,  342.26663208,  343.30865479,  344.2265625 ,
        345.27429199,  348.77453613,  349.26464844,  351.24914551,
        351.75262451,  352.03048706,  354.2409668 ,  356.27966309,
        357.28268433,  358.28607178,  360.18197632,  362.02325439,
        362.24169922,  365.20654297,  371.26376343,  372.24899292,
        373.2336731 ,  379.23803711,  380.90090942,  383.21731567,
        384.21990967,  387.24923706,  391.28433228,  392.26193237,
        392.28659058,  393.28945923,  393.31152344,  397.28356934,
        398.88037109,  400.24435425,  401.24572754,  402.88336182,
        410.32952881,  413.28912354,  419.31491089,  420.32220459,
        424.02511597,  428.27587891,  436.34091187,  437.34564209,
        442.00537109,  453.34384155,  453.8454895 ,  454.3470459 ,
        454.8480835 ,  455.34988403,  456.35092163,  461.85845947,
        462.36126709,  469.33932495,  470.37054443,  471.37371826,
        472.37667847,  473.37887573,  475.32550049,  482.40563965,
        497.33847046,  498.34460449,  498.40228271,  499.40551758,
        526.43310547,  526.88153076,  566.43060303,  679.5111084 ,
        680.51470947,  681.51940918,  696.53643799,  697.53869629,
        697.6081543 ,  701.49133301,  824.35168457,  953.55450439,
       1049.73718262]), 'intensity array': array([3.08643994e+03, 2.46264893e+03, 2.72033750e+04, 2.74241162e+03,
       4.18943359e+03, 3.94366064e+03, 2.38690576e+03, 2.93491543e+04,
       4.03352954e+03, 3.70901318e+03, 5.46660596e+03, 2.72958374e+03,
       3.11966089e+03, 1.06019551e+04, 2.96716724e+03, 1.64737969e+04,
       5.31058320e+04, 3.49217651e+03, 1.78674180e+04, 2.23583574e+04,
       3.84103857e+03, 5.34135840e+03, 4.61114844e+03, 2.35314014e+03,
       5.14221826e+03, 1.52673312e+05, 3.53059937e+03, 4.65196631e+03,
       1.69600137e+04, 1.19285117e+04, 1.69284062e+04, 1.08998994e+04,
       4.27840625e+03, 5.80239795e+03, 2.81877051e+03, 4.00246387e+03,
       4.34100049e+03, 4.51127344e+03, 4.69465723e+03, 3.04074292e+03,
       2.93289844e+03, 3.75165259e+03, 3.24270947e+03, 7.71020625e+04,
       2.92360293e+04, 3.85791750e+05, 2.80208325e+03, 4.18535986e+03,
       1.49325752e+04, 1.75045781e+05, 4.55254443e+03, 1.11394746e+04,
       3.73961060e+03, 1.30034102e+04, 1.78270938e+05, 5.82286279e+03,
       1.10546543e+04, 1.40072832e+04, 5.03309082e+03, 2.66921387e+03,
       3.15045190e+03, 4.68517334e+03, 5.30743457e+03, 5.02005273e+03,
       5.61103027e+03, 4.79409717e+03, 3.09188940e+03, 2.54386895e+04,
       2.59175342e+03, 4.33376270e+03, 2.59613354e+03, 5.02313525e+03,
       1.86672305e+04, 3.65459399e+03, 4.17647412e+03, 4.63909131e+03,
       1.16460742e+04, 3.64648789e+04, 1.45964307e+04, 4.43941895e+03,
       3.89881885e+03, 3.54748555e+04, 5.53771240e+03, 9.86927637e+03,
       4.00328149e+03, 2.83784131e+03, 8.38448438e+04, 3.55323359e+04,
       5.42793457e+03, 5.50888184e+03, 1.39486953e+04, 5.06257031e+04,
       3.01800171e+03, 5.71779443e+03, 2.59621973e+03, 4.07175000e+03,
       1.23230586e+04, 2.42848965e+04, 4.74640820e+03, 2.92856421e+03,
       3.39111094e+04, 1.27965264e+04, 1.24993574e+04, 1.93810879e+04,
       2.73361450e+03, 1.37448281e+04, 2.86588159e+03, 4.69678076e+03,
       2.82942798e+03, 2.37619604e+03, 1.31355127e+04, 2.33956934e+03,
       2.88234863e+03, 6.14399072e+03, 4.16217500e+04, 6.28384180e+03,
       1.46020029e+04, 3.00078320e+03, 1.48680781e+04, 3.57960449e+03,
       1.02213359e+04, 3.14258081e+03, 5.15372773e+04, 2.49283652e+04,
       2.31469766e+04, 1.20778154e+04, 1.66199316e+04, 4.95953633e+04,
       5.60851514e+03, 4.77246826e+03, 1.37250898e+04, 4.83379883e+04,
       1.03092393e+04, 6.14433691e+03, 3.11333716e+03, 3.32092212e+03,
       1.13853984e+04, 3.20222266e+03, 4.96653750e+04, 3.68889453e+03,
       2.79689355e+04, 6.27621875e+03, 2.86240869e+03, 2.84022583e+03,
       1.16316396e+04, 2.87024097e+03, 3.35553735e+03, 9.21796777e+03,
       1.31596748e+04, 5.11849463e+03, 4.61045781e+04, 4.29282812e+03,
       4.56351025e+03, 3.51825312e+04, 5.51752393e+03, 5.94271289e+03,
       2.95988623e+03, 4.45850830e+03, 2.14252109e+04, 2.69751318e+03,
       2.20123633e+04, 2.83936377e+03, 3.10551855e+03, 3.53405396e+03,
       3.34235327e+03, 1.55438320e+04, 8.27916875e+05, 5.20299268e+03,
       6.46707148e+04, 1.01388584e+04, 3.65691680e+04, 5.98906738e+03,
       4.46697021e+03, 5.96577344e+03, 1.96428008e+04, 4.82004883e+03,
       1.49324717e+04, 8.74518066e+03, 3.60177734e+03, 3.41104858e+03,
       2.22256465e+04, 3.37564771e+03, 3.65952612e+03, 3.38363257e+03,
       1.21527793e+04, 1.06393965e+04, 6.04770078e+04, 6.39734277e+03,
       4.45529346e+03, 2.95434692e+03, 6.22482227e+03, 3.16312578e+04,
       5.66283545e+03, 4.06802800e+06, 4.48862500e+03, 2.42950801e+04,
       9.65540039e+03, 5.17064906e+05, 4.34522188e+04, 4.04672578e+04,
       3.58961157e+03, 1.68030332e+04, 4.22729805e+04, 1.62331113e+04,
       2.14611973e+04, 3.13852222e+03, 3.46335254e+03, 1.15544238e+04,
       7.37960781e+04, 1.40391125e+05, 3.31479492e+03, 5.26724414e+04,
       1.11655703e+04, 6.63749609e+04, 5.78321582e+03, 3.35654199e+03,
       1.24319229e+04, 6.56035059e+03, 6.99006484e+04, 5.65097656e+03,
       1.68429277e+04, 3.87433765e+03, 2.99359326e+03, 3.62571045e+03,
       4.35698926e+03, 3.10479199e+03, 1.50695156e+04, 4.50729199e+03,
       4.68995459e+03, 4.81567188e+03, 1.06146016e+04, 5.68873389e+03,
       2.75502686e+03, 4.94391699e+03, 3.27859058e+03, 3.50919824e+03,
       1.59211047e+05, 5.05956592e+03, 1.14453701e+04, 4.62261914e+04,
       3.95171211e+04, 5.87471484e+03, 4.28995508e+03, 3.42045288e+03,
       5.40951904e+03, 3.05078149e+03, 3.02939648e+03, 3.78984790e+03,
       1.98685273e+04, 3.52063721e+03, 1.72552754e+04, 6.09890479e+03,
       3.00940137e+03, 4.00431543e+03, 1.58364199e+04, 5.09720312e+03,
       2.51977754e+04, 3.33994751e+03, 3.84136621e+03, 2.24508535e+04,
       4.30696924e+03, 4.06964844e+04, 5.49942627e+03, 4.20265186e+03,
       2.18472441e+04, 3.70760400e+03, 5.44197266e+04, 4.18894238e+03,
       1.68035371e+04, 2.64659707e+04, 6.70855625e+04, 3.03269824e+03,
       4.03667676e+03, 3.93206812e+03, 5.52983008e+03, 3.98742188e+04,
       2.10007324e+04, 3.13256396e+03, 9.60883203e+03, 1.69085898e+04,
       3.11437573e+03, 3.57538232e+03, 2.53698613e+04, 1.92314434e+04,
       5.23114404e+03, 5.48868701e+03, 2.11570078e+04, 4.64473828e+03,
       6.06976074e+03, 3.55723340e+03, 4.45637744e+03, 1.01935020e+04,
       3.63179956e+03, 3.73431714e+03, 3.90291260e+03, 3.46645093e+03,
       9.64896680e+03, 4.66274463e+03, 1.49472207e+04, 1.23581172e+04,
       3.16178027e+03, 4.99056641e+03, 1.22764014e+04, 2.12166348e+04,
       3.57697485e+03, 1.80048594e+04, 4.23439551e+03, 4.18405078e+03,
       4.28571875e+03, 3.73300952e+03, 1.01944365e+04, 5.69710254e+03,
       1.22791348e+04, 4.50207950e+06, 1.82761775e+06, 3.84491500e+05,
       6.20955156e+04, 1.17986455e+04, 4.18449170e+03, 5.71974658e+03,
       4.64763818e+03, 5.11858740e+03, 1.04246807e+04, 2.57208633e+04,
       1.21798369e+04, 3.53034668e+03, 4.51639014e+03, 1.59336109e+05,
       4.47562656e+04, 3.90541089e+03, 3.85566431e+03, 1.06717637e+04,
       4.03035522e+03, 5.82785059e+03, 4.12029541e+03, 5.91424219e+03,
       1.27252910e+04, 6.41571631e+03, 1.10867744e+04, 4.13909023e+04,
       5.33819531e+03, 4.85026660e+03, 2.22550125e+05, 5.48246631e+03,
       5.28025469e+04, 1.11685176e+04, 1.27671416e+04, 4.51152783e+03,
       5.35385791e+03, 4.87360781e+04, 1.16619199e+04, 4.96894629e+03,
       4.00218799e+03, 1.60179062e+04, 1.76551270e+04, 3.79212500e+03,
       6.12456885e+03, 3.07149004e+04, 2.28532891e+04, 4.46533984e+03,
       4.15900928e+03, 4.83408100e+06, 9.45670625e+04, 1.27504962e+06,
       5.24563525e+03, 1.82740062e+05, 2.36904082e+04, 4.62129639e+03,
       4.49638477e+03, 4.56244336e+03, 7.80218625e+05, 2.14587656e+05,
       3.81560859e+04, 4.09080103e+03, 1.92084688e+04, 5.72966992e+03,
       3.24531133e+04, 3.71876587e+03, 4.81253867e+04, 1.15181133e+04,
       5.91081592e+03, 4.20000732e+03, 5.35310840e+03, 3.45229219e+05,
       1.48760812e+05, 3.62823906e+04, 3.59425547e+04, 1.48421885e+04,
       3.43743652e+03, 3.65723804e+03, 3.34065308e+03, 3.58571851e+03,
       3.25074927e+03], dtype=float32)}

I would expect a polarity field with a value such as + or -, which would be most convenient. Can I find the polarity in the data?

mobiusklein commented 1 year ago

Only mzXML has a dedicated polarity field. In mzML, and by extension mzMLb, polarity is indicated by the presence of the parameter 'positive scan' or 'negative scan'.

It's in the dictionary you're showing here: the 'positive scan': '' entry, just before 'centroid spectrum'.
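For readers who want the + or - value asked for above, here is a minimal sketch of how it could be derived from the parsed spectrum dictionary. The helper name get_polarity is hypothetical and not part of pyteomics; it only assumes that polarity shows up as a 'positive scan' or 'negative scan' key, as in the output posted in this issue.

```python
def get_polarity(spectrum):
    """Return '+', '-', or None for a parsed mzML/mzMLb spectrum dict.

    Hypothetical helper (not a pyteomics function): it only checks for
    the presence of the 'positive scan' / 'negative scan' keys.
    """
    if 'positive scan' in spectrum:
        return '+'
    if 'negative scan' in spectrum:
        return '-'
    # Neither parameter was recorded for this spectrum
    return None


# Trimmed-down version of the spectrum dict shown in this issue
example = {'index': 0, 'ms level': 1, 'positive scan': '', 'centroid spectrum': ''}
print(get_polarity(example))
```

With the reader from the original snippet, the same helper could be applied to each spectrum, for example by collecting get_polarity(s) for every s in ms_data.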

sorenwacker commented 1 year ago

Ok, thanks!!