Open tobiasko opened 4 months ago
Hi Tobias,
This likely indicates that the .predicted.speclib was generated with protein inference set to 'Genes', while here it is set to protein names, hence the discrepancy in the proteotypicity definition.
Best, Vadim
Ahhhhhhh! So the --pg-level [N] default is 2 (gene), and this parameter also affects library prediction from a UniProt FASTA file (GN=XXXX)? BTW: what happens if a speclib from an external source (e.g. PROSIT) does not contain the proteotypicity information, or is even missing the protein entry the parent was derived from? Example in .msp format:
Name: MLGNMNVFMAVLGIILFSGFLAAYFSHK/2
MW: 1546.302144302
Comment: Parent=1546.30214430 Collision_energy=30 Mods=0 ModString=MLGNMNVFMAVLGIILFSGFLAAYFSHK///2 iRT=166.03
Num peaks: 40
147.11280823 0.1296 "y1/0.0ppm"
284.17172241 0.3208 "y2/0.0ppm"
245.13182068 0.2179 "b2/0.0ppm"
371.20373535 0.3588 "y3/0.0ppm"
302.15328979 0.0904 "b3/0.0ppm"
518.27215576 0.5050 "y4/0.0ppm"
416.19622803 0.2998 "b4/0.0ppm"
681.33551025 0.4583 "y5/0.0ppm"
547.23669434 0.3665 "b5/0.0ppm"
752.37261963 0.5639 "y6/0.0ppm"
661.27960205 0.3683 "b6/0.0ppm"
823.40972900 0.5475 "y7/0.0ppm"
760.34802246 0.7807 "b7/0.0ppm"
936.49377441 0.3506 "y8/0.0ppm"
907.41644287 0.1586 "b8/0.0ppm"
1083.56213379 0.1840 "y9/0.0ppm"
1038.45690918 0.2239 "b9/0.0ppm"
1140.58361816 0.6261 "y10/0.0ppm"
1109.49401855 0.2785 "b10/0.0ppm"
1227.61572266 0.9031 "y11/0.0ppm"
1208.56250000 0.2053 "b11/0.0ppm"
1374.68408203 1.0000 "y12/0.0ppm"
1321.64648438 0.1896 "b12/0.0ppm"
1487.76818848 0.9154 "y13/0.0ppm"
1378.66796875 0.1396 "b13/0.0ppm"
1600.85217285 0.7807 "y14/0.0ppm"
1491.75207520 0.1016 "b14/0.0ppm"
1713.93627930 0.2664 "y15/0.0ppm"
1604.83618164 0.0607 "b15/0.0ppm"
1770.95776367 0.9069 "y16/0.0ppm"
1717.92016602 0.0450 "b16/0.0ppm"
1884.04187012 0.5717 "y17/0.0ppm"
1864.98864746 0.0218 "b17/0.0ppm"
1983.11022949 0.2061 "y18/0.0ppm"
2054.14746094 0.1089 "y19/0.0ppm"
2185.18774414 0.1035 "y20/0.0ppm"
2332.25634766 0.1948 "y21/0.0ppm"
1166.63171387 0.0697 "y21^2/0.0ppm"
1273.18737793 0.0303 "y23^2/0.0ppm"
1424.23986816 0.0108 "y26^2/0.0ppm"
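To make the point about the missing protein entry concrete, here is a minimal Python sketch that reads the header of the entry above and checks whether its Comment field carries any protein annotation. The helper names and the set of protein key names (`PROTEIN_KEYS`) are assumptions for illustration only; they are not part of DIA-NN or PROSIT, and other .msp exporters may use different keys.

```python
# Header lines copied from the .msp entry above (peak list omitted).
entry = """Name: MLGNMNVFMAVLGIILFSGFLAAYFSHK/2
MW: 1546.302144302
Comment: Parent=1546.30214430 Collision_energy=30 Mods=0 ModString=MLGNMNVFMAVLGIILFSGFLAAYFSHK///2 iRT=166.03
Num peaks: 40""".splitlines()

# Assumption: key names that an exporter might use for protein annotation.
PROTEIN_KEYS = {"protein", "proteinid", "proteinids", "gene", "genes"}


def parse_msp_header(lines):
    """Collect 'Key: value' header fields up to and including 'Num peaks'."""
    header = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a header line (e.g. a peak line)
        header[key.strip()] = value.strip()
        if key.strip().lower() == "num peaks":
            break
    return header


def parse_comment(comment):
    """Split the space-separated key=value pairs of the Comment field."""
    fields = {}
    for token in comment.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


header = parse_msp_header(entry)
comment = parse_comment(header["Comment"])
has_protein = any(key.lower() in PROTEIN_KEYS for key in comment)

print(sorted(comment))
print("protein annotation present:", has_protein)
```

For this entry the Comment field only holds the parent mass, collision energy, modification info and iRT, so the check finds no protein annotation.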
Should one always add --reannotate --pg-level [N] if one wants to compare stats on an aggregated level like protein group/gene?
Makes sense indeed if the library missed protein info.
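For reference, a rough sketch of how such a call could be assembled from Python. The flags themselves (--f, --lib, --fasta, --reannotate, --pg-level, --out) are DIA-NN command-line options, but the executable name `diann`, all file names, the helper function, and the reading of the levels as 0 = isoform IDs, 1 = protein names, 2 = genes are assumptions to check against the documentation of the installed version.

```python
import shlex


def build_diann_args(raw_file, library, fasta, out_report, pg_level, exe="diann"):
    """Assemble a DIA-NN argument list that reannotates the library from a FASTA.

    Assumption: pg_level is 0 (isoform IDs), 1 (protein names) or 2 (genes).
    """
    if pg_level not in (0, 1, 2):
        raise ValueError("pg_level must be 0, 1 or 2")
    return [
        exe,
        "--f", raw_file,
        "--lib", library,
        "--fasta", fasta,
        "--reannotate",               # take protein info from the FASTA
        "--pg-level", str(pg_level),  # same grouping level for every run
        "--out", out_report,
    ]


# Hypothetical file names, for illustration only.
args = build_diann_args(
    "run01.raw", "library.predicted.speclib", "uniprot.fasta",
    "report.tsv", pg_level=2,
)
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```

Using the same --pg-level both when the library is generated and when it is searched should keep the proteotypicity definition consistent between the two steps.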
I used the following commands to run DIA-NN and got the above warning:
How can this happen, since the speclib was generated by DIA-NN itself, starting from a UniProt FASTA database?