Closed oliveralka closed 3 years ago
Amazing, thank you! This is very helpful. I will work on it today and update the workflow. I am having some issues but I'll give it a try to solve them before asking you :)
Alright so this is what I have so far:
Sirius = SiriusMSFile()
argument1 = exp
argument2 = SiriusTemporaryFileSystemObjects.getTmpMsFile()
argument3 = FeatureMapping_FeatureToMs2Indices()
feature_only = True  # SiriusAdapterAlgorithm.getFeatureOnly() == True
Isotope_iter = 3  # SiriusAdapterAlgorithm.getIsotopePatternIterations()
Isotopemasstraceinfo = False
CompoundInfo = []
Sirius.store(exp, argument2, argument3, feature_only, True, 3, False, CompoundInfo)
This gives me the following error:
Traceback (most recent call last):
File "
Which I believe comes from argument2, which requires an argument or just a simpler version. I've simply tried to call "siriustest.ms" or something similar but nothing works so far.
Did you figure it out? I will take a look, what pyopenms version are you using?
edit: Could you please provide the example data you are currently using to test the workflow prototype in the repository? Then I can run it, in the current configuration.
The pyopenms version is 2.5.0. Unfortunately I did not figure it out yet, but I'm having some inconvenient issues with my editor, so I'm trying to fix that too. I would expect that a *.ms file would be fine, but somehow it doesn't like that :D
I am using the GermicidinAstandard.mzML from https://drive.google.com/drive/folders/1O0JmZa17oqyzObAjphbXxyHmE9LF6Tkf?usp=sharing here.
Alright, so I think I'm starting to get the idea when looking at the cpp script. Thank you so much for the guidance so far. It really helped a lot!
In case you could help, now I'm having the following issue:
At the store step, I am getting a "Segmentation fault 11 / Core dumped". I also ran the script in the shared machine that we have (big memory) in case it is a storage issue, but the error persists, so it must be one of the arguments that I am calling and I suspect that it is the String(sirius_tmp.getTmpDir()), because when I run it on its own I am getting this error:
String(sirius_tmp.getTmpDir())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __repr__ returned non-string (type bytes)
I don't think this is something a String() class can do. Or is there something I am missing?
TypeError: __repr__ returned non-string (type bytes)
There seems to be an error with the type the algorithm is getting.
In Python you could use a print statement to check what the current return value is:
e.g. print(String(sirius_tmp.getTmpDir()))
It will probably give you a byte string (b' '), which is a type in Python 3; this type can be converted to a regular string.
You can try to decode the byte string via decode('utf-8')
This should then look somewhat like this:
String(sirius_tmp.getTmpDir()).decode('utf-8')
Let me know if that works!
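A minimal plain-Python illustration of the bytes-vs-str issue (the literal path below is a made-up stand-in for what sirius_tmp.getTmpDir() returns):

```python
# Stand-in for the return value of sirius_tmp.getTmpDir() (hypothetical path).
tmp_dir = b'/private/var/folders/T/20210302_example_tmp'

print(type(tmp_dir))                   # <class 'bytes'> - printed as b'...'
tmp_dir_str = tmp_dir.decode('utf-8')  # convert bytes -> regular str
print(type(tmp_dir_str))               # <class 'str'>
print(tmp_dir_str)                     # plain path, no leading b'...'
```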
Oh yeah that's exactly what it gives me (b' ') !
I tried String(sirius_tmp.getTmpDir()).decode('utf-8')
but I am getting an AttributeError: 'pyopenms.pyopenms_8.String' object has no attribute 'decode'
hm, for me it works with
String(sirius_tmp.getTmpDir())
code:
# construct sirius ms file object
msfile = SiriusMSFile()
# fill variables, which are used in the function
argument1 = exp
# TODO: need to construct the feature mapping
feature_mapping = FeatureMapping_FeatureToMs2Indices()
feature_only = True #SiriusAdapterAlgorithm.getFeatureOnly()==True
isotope_pattern_iterations = 3
no_mt_info = False
compound_info = [] #SiriusMSFile_CompoundInfo()
#this is a parameter, which is called "feature_only"
#It is a boolean value (true/false) and if it is true you are using the feature information
#from in_featureinfo to reduce the search space to MS2 associated with a feature.
#this is recommended when working with featureXML input, if you do NOT use it
#sirius will use every individual MS2 spectrum for estimation (and it will take ages)
#bool feature_only = (sirius_algo.getFeatureOnly() == "true") ? true : false;
#SiriusAdapterAlgorithm.getNoMasstraceInfoIsotopePattern() == False
print(sirius_tmp.getTmpDir())
print(String(sirius_tmp.getTmpDir()))
msfile.store(exp,
String(sirius_tmp.getTmpDir()), # has to be converted to an "OpenMS::String"
feature_mapping,
feature_only,
isotope_pattern_iterations,
no_mt_info,
compound_info)
terminal output:
/private/var/folders/t7/x82jn_jd09vc_hlq_sqs2jlw0000gn/T/20210302_173419_Olivers-MBP.fritz.box_18774_1
b'/private/var/folders/t7/x82jn_jd09vc_hlq_sqs2jlw0000gn/T/20210302_173419_Olivers-MBP.fritz.box_18774_1'
.
<ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!> occurred 1081 times
..
Warning: a significant portion of your decharged molecules have gapped, even-numbered charge ladders (42 of 422)This might indicate a too low charge interval being tested.
.
<..> occurred 2 times
No MS1 spectrum for this precursor. Occurred 0 times.
0 spectra were skipped due to precursor charge below -1 and above +1.
Mono charge assumed and set to charge 1 with respect to current polarity 696 times.
0 features were skipped due to feature charge below -1 and above +1.
Currently, the feature_mapping is still missing, which has to be performed in the preprocessing step; that is why you do not get any output yet.
Ah ok, you updated the script - let me check the newest version.
It is the same. Don't bother :D Alright, I misunderstood. I thought the feature_mapping was constructed in the preprocessing step
No, you did not misunderstand. It is constructed in the preprocessing step.
I was currently looking on my branch the one in the PR - and there everything worked with the ".ms" file, based on String(sirius_tmp.getTmpDir())
.
I will check your master branch now.
ahh I See :) okok!
1) We are using the wrong function :D
String(sirius_tmp.getTmpDir())
<- points to the temporary directory, which is used in the QProcess call
String(sirius_tmp.getTmpMsFile())
<- is the correct function to get the temporary .ms file
But it seems I still get a segfault and I am not sure why.
yes, I tried that one too, since that was the one I saw in the cpp, but I am also getting the same output :/ segfault 11
Ok, I will try to debug it - this might be an issue with the wrapping between c++ and python. So not much you can do at the moment I guess.
With String(sirius_tmp.getTmpMsFile())
is it the exact same error as you posted above?
No, it doesn't actually! It accepts String(sirius_tmp.getTmpMsFile())
Before you try to debug it: I'm running the script with a different file right now (Leupeptin.mzML) and it has been running for the past 5 min without giving me a segfault! I didn't add any print command, so I'm not sure where it is right now, but definitely after the deconvolution step! I'll let you know when I get an output.
hm, ok that is good news actually - did you change anything else? Could you update your current master branch?
yes, I updated it! No, nothing else. Only the file! It's so annoying, it's the 2nd time this happens this month.
I think the mzML files are just not very consistent. Not sure what is going on exactly, but when I convert it through my bash shell I am getting seg faults in pyopenms. When I convert it through the UI (proteowizard) I am not usually getting this error, until today.
Hm, ok - great that it works with the "Leupeptin.mzML", so you can proceed with your development of the pyopenms pipeline!
I think I will still take a look at the C++ side when I have time with the other mzML, since if the program segfaults, there is usually something wrong. For example, an edge case that is not handled correctly.
Great, I will let you know how it goes! For now, it's still not even preprocessed yet.
> I think the mzML files are just not very consistent. Not sure what is going on exactly, but when I convert it through my bash shell I am getting segfaults in pyopenms. When I convert it through the UI (proteowizard) I am not usually getting this error, until today.
You could try to use the FileConverter from mzML to mzML after conversion via shell/proteowizard. That is not super convenient, but it might correct issues with the proteowizard files.
Really?? So weird. I will give it a shot. I suspect that if you don't feed it with specific parameters (msconvert Leupeptin.raw --mzML --centroid or something) it just gives you inconsistent results. Not sure. I will look into it tomorrow. Thank you for today and have a good night!
No Problem! Have a good night!
PS: You have to be sure what kind of spectra are centroided in the conversion process. If you have already measured the MS2 spectra in centroid mode and are centroiding them again in the conversion process this may lead to corrupt MS2 spectra.
Yes, I figured it out when I started building the workflow. Btw, it didn't run. I think it's stuck. I will take a look at it tomorrow step by step and see where the problem is.
Ok, it might be worth checking whether the default parameters are set for the algorithms, or setting them explicitly. Let me know how it goes!
Edit: It seems to work without any issue on the c++ side with
-executable /Users/alka/Documents/work/software/sirius-osx64-4.0.1/bin/sirius
-in /Users/alka/Desktop/tests_and_issues/DTU_Efi/siriusadapter_test/GermicidinAstandard.mzML
-in_featureinfo /Users/alka/Desktop/tests_and_issues/DTU_Efi/siriusadapter_test/devoncoluted_GermicidinAstandard.featureXML
-out_ms /Users/alka/Desktop/tests_and_issues/DTU_Efi/siriusadapter_test/GermicidinAstandard_out_sirius.ms
-converter_mode
-preprocessing:filter_by_num_masstraces 3
-preprocessing:feature_only
I see. So this could be a python-wrapper issue or a parameter problem?
I opened the file (GermicidinAstandard_out_sirius.ms.zip) and unfortunately the correct mass is actually missing. So I definitely need to play around with the parameters. Germicidin A is 196.109945 (neutral mass). I can find the M+H exact mass in the FeatureFindingMetabo.featureXML file (197.1185), feature id="f_14619441151854324250". But that's it. I'm looking into the deconvoluted.featureXML and it actually seems identical to the FeatureFindingMetabo.featureXML
Ok, I have a theory after a long day of trying to understand what is going on:
The files are mostly ok - the ones I am converting using the MSConvert -GUI.
However, the MetaboliteFeatureDeconvolution step is problematic. It basically generates the exact same file as the FeatureFindingMetabo() or it gets completely stuck when I try a file larger than Germicidin A (e.g. Leupeptin).
At the same time, I tried the workflow in TOPPAS (using the Germicidin A file, either raw or converted in TOPPAS) and again, it crashes at the MetaboliteAdductDecharger step!
15:19:45 ERROR: MetaboliteAdductDecharger crashed!
So strange!
What adducts are set in the parameters, when you run the MetaboliteAdductDecharger? It might be that the search space gets too big.
I would suggest that you optimize the parameters for the feature detection step (FFM) and then try to run MAD again.
The difference between the FFM featureXML and the MAD featureXML should be that adducts are annotated.
Could it be that you store the wrong FeatureMap by mistake?
deconv = MetaboliteFeatureDeconvolution()
f_out = FeatureMap()
cons_map0 = ConsensusMap()
cons_map1 = ConsensusMap()
deconvoluted = deconv.compute(feature_map, f_out, cons_map0, cons_map1)
deconvol = FeatureXMLFile()
deconvol.store("./wf_testing/devoncoluted.featureXML", feature_map)
should the last step be:
deconvol.store("./wf_testing/devoncoluted.featureXML", f_out)
I had deconvol.store("./wf_testing/devoncoluted.featureXML", feature_map)
and switched it to deconvol.store("./wf_testing/devoncoluted.featureXML", f_out)
Just now to check the differences! I will change it back!
And I will take a look at the parameters. That could be the problem!
ff = FeatureFindingMetabo()
ff.run(mass_traces_split,
feature_map,
mass_traces_filtered)
feature_map is used in the FFM to store the information.
You use that information again, and save the new information in f_out
.
deconvoluted = deconv.compute(feature_map, f_out, cons_map0, cons_map1)
This means: f_out
should be the correct one, could you compare feature_map
and your f_out
?
The one filled in the algorithm should have data about the adduct(s).
You can check that easily by using diff file1 file2
in the terminal.
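If diff is not at hand, the same comparison can be sketched in plain Python with the stdlib difflib module (the file names in the usage comment are just the ones from the example above):

```python
import difflib

def show_diff(path_a, path_b, n_context=2):
    """Print a unified diff of two text files (e.g. the two featureXML files)."""
    with open(path_a) as fa, open(path_b) as fb:
        diff = difflib.unified_diff(fa.readlines(), fb.readlines(),
                                    fromfile=path_a, tofile=path_b, n=n_context)
        for line in diff:
            print(line, end='')

# e.g. show_diff("FeatureFindingMetabo.featureXML", "devoncoluted.featureXML")
```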
Try to go over the code step by step; in some cases it might help to rename the variables in a way that you know where they come from.
e.g. feature_map_ffm; feature_map_dec
edit: Another option would be to run the tools via the command line or KNIME and then try to reproduce it with the python script; then you see what the output should look like in the first place, and if it runs in KNIME/command line it should also run in pyopenms, unless there is a wrapping error.
bash output up to deconvolution:
Generating Masses with threshold: -8.9872 ...
done
6705 of 17271 valid net charge compomer results did not pass the feature charge constraints
Inferring edges raised edge count from 14308 to 36066
Found 36066 putative edges (of 180481) and avg hit-size of 0.716575
Using solver 'coinor' ...
Optimal solution found!
<Using solver 'coinor' ...> occurred 50 times
Branch and cut took 40.9798 seconds, with objective value: 0.511625.
<Optimal solution found!> occurred 50 times
ILP score is: 0.511625
Agreeing charges: 367/2922
ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!
<ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!> occurred 1081 times
..
Warning: a significant portion of your decharged molecules have gapped, even-numbered charge ladders (42 of 422)This might indicate a too low charge interval being tested.
<..> occurred 2 times
diff between the files:
value="[851.384000000000015]"/>
> <UserParam type="floatList" name="masstrace_centroid_mz" value="[142.119051803574308]"/>
> 37980a47822,47828
> <UserParam type="int" name="map_idx" value="0"/>
> <UserParam type="string" name="dc_charge_adducts" value="H1"/>
> <UserParam type="stringList" name="adducts" value="[[M+H]+]"/>
> <UserParam type="float" name="dc_charge_adduct_mass" value="1.0078250319"/>
> <UserParam type="int" name="is_backbone" value="1"/>
> <UserParam type="int" name="old_charge" value="0"/>
> <UserParam type="string" name="Group" value="3769097129665771319"/>
> 37982,37985c47830,47833
< <feature id="f_18258132156436287456">
< <position dim="0">697.333999999999946</position>
< <position dim="1">774.59791474306428</position>
< <intensity>1.285593e04</intensity>
---
> <feature id="f_4885285558678229880">
> <position dim="0">854.868000000000052</position>
> <position dim="1">118.087559968126442</position>
> <intensity>3.775919e04</intensity>
> 37988c47836
< <overallquality>3.980958e-05</overallquality>
---
> <overallquality>1.169248e-04</overallquality>
> 37990,37991c47838,47839
< <UserParam type="string" name="label" value="T732.2"/>
< <UserParam type="float" name="FWHM" value="3.439255952835083"/>
---
> <UserParam type="string" name="label" value="T961.14"/>
> <UserParam type="float" name="FWHM" value="13.10629940032959"/>
> 37993,37995c47841,47843
< <UserParam type="floatList" name="masstrace_intensity" value="[1.285592999999993e04]"/>
< <UserParam type="floatList" name="masstrace_centroid_rt" value="[697.333999999999946]"/>
< <UserParam type="floatList" name="masstrace_centroid_mz" value="[774.59791474306428]"/>
---
> <UserParam type="floatList" name="masstrace_intensity" value="[3.775918800000008e04]"/>
> <UserParam type="floatList" name="masstrace_centroid_rt" value="[854.868000000000052]"/>
> <UserParam type="floatList" name="masstrace_centroid_mz" value="[118.087559968126442]"/>
> 37997a47846,47847
> <UserParam type="string" name="Group" value="18297541040978580586"/>
> <UserParam type="int" name="is_ungrouped_monoisotopic" value="1"/>
37999,38002c47849,47852
< <feature id="f_8593042559675873728">
< <position dim="0">704.301000000000045</position>
< <position dim="1">776.236362935943248</position>
< <intensity>4.351943e04</intensity>
---
(and a lot more lines)
> <UserParam type="string" name="dc_charge_adducts" value="H1"/>
> <UserParam type="stringList" name="adducts" value="[[M+H]+]"/>
This should be seen after deconvolution. It basically annotates an adduct for this feature. Which command-line tool are you using?
diff FeatureFindingMetabo.featureXML devoncoluted.featureXML -y
If I look into the file with Ctrl+F, I can detect what you wrote to me. So that's good.
> edit: Another option would be to run the tools via the command line or KNIME and then try to reproduce it with the python script; then you see what the output should look like in the first place, and if it runs in KNIME/command line it should also run in pyopenms, unless there is a wrapping error.
I will try that :)
UPDATES:
Sirius works great through KNIME. I am now converting all the files to centroid data using the command line:
msconvert --zlib --filter "peakPicking true [1 ,2]" --ignoreUnknownInstrumentError
and the results are consistent. However, I double-checked my whole script and it looks ok to me, except for 2 variables.
This is my train of thought: I am always having the following output:
Generating Masses with threshold: -8.9872 ...
done
1217 of 3702 valid net charge compomer results did not pass the feature charge constraints
Inferring edges raised edge count from 3418 to 8518
Found 8518 putative edges (of 20230) and avg hit-size of 0.674454
Using solver 'coinor' ...
Optimal solution found!
<Using solver 'coinor' ...> occurred 16 times
Branch and cut took 2.76753 seconds, with objective value: 1.21756.
<Optimal solution found!> occurred 16 times
ILP score is: 1.21756
Agreeing charges: 97/776
ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!
preprocessed
.
<ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!> occurred 477 times
..
Warning: a significant portion of your decharged molecules have gapped, even-numbered charge ladders (12 of 117)This might indicate a too low charge interval being tested.
.
<..> occurred 2 times
Number of features to be processed: 80
Number of additional MS2 spectra to be processed: 4169
checked
Segmentation fault: 11
As far as I understand, a segfault occurs when there is an invalid memory access (for example a null pointer dereference, an out-of-bounds read, or something unexpectedly being zero). I can see from the output: ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!> occurred 477 times
Which could be the problem. I searched what ConsensusFeature refers to and it is possibly linked to the preprocessing step and the feature_mapping construction.
KDTreeFeatureMaps fp_map_kd; // reference to *basefeature in vector<FeatureMap>
FeatureMapping::FeatureToMs2Indices feature_mapping; // reference to *basefeature in vector<FeatureMap>
featureinfo = "./wf_testing/devoncoluted.featureXML"
spectra = exp
v_fp = []
fp_map_kd = KDTreeFeatureMaps()
sirius_algo = SiriusAdapterAlgorithm()
feature_mapping = FeatureMapping_FeatureToMs2Indices()
sirius_algo.preprocessingSirius(featureinfo,
                                spectra,
                                v_fp,
                                fp_map_kd,
                                sirius_algo,
                                feature_mapping)
I think that feature_mapping needs to be somehow linked to the *basefeature in v_fp vector? Does this make sense? I will look into it.
I think the ConsensusFeature in this case has nothing to do with the preprocessing.
The problem with the error message is that it is somehow delayed.
Generating Masses with threshold: -8.9872 ... <- MetaboliteAdductDecharger
done <- MetaboliteAdductDecharger
1217 of 3702 valid net charge compomer results did not pass the feature charge constraints <- MetaboliteAdductDecharger
Inferring edges raised edge count from 3418 to 8518 <- MetaboliteAdductDecharger
Found 8518 putative edges (of 20230) and avg hit-size of 0.674454 <- MetaboliteAdductDecharger
Using solver 'coinor' ... <- MetaboliteAdductDecharger
Optimal solution found! <- MetaboliteAdductDecharger
<Using solver 'coinor' ...> occurred 16 times <- MetaboliteAdductDecharger
Branch and cut took 2.76753 seconds, with objective value: 1.21756. <- MetaboliteAdductDecharger
<Optimal solution found!> occurred 16 times <- MetaboliteAdductDecharger
ILP score is: 1.21756 <- MetaboliteAdductDecharger
Agreeing charges: 97/776 <- MetaboliteAdductDecharger
ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0! <- MetaboliteAdductDecharger
preprocessed
. <- MetaboliteAdductDecharger
<ConsensusFeature::computeDechargeConsensus() WARNING: Feature's charge is 0! This will lead to M=0!> occurred 477 times <- MetaboliteAdductDecharger
.. <- MetaboliteAdductDecharger
Warning: a significant portion of your decharged molecules have gapped, even-numbered charge ladders (12 of 117)This might indicate a too low charge interval being tested. <- MetaboliteAdductDecharger
.<- MetaboliteAdductDecharger
<..> occurred 2 times <- MetaboliteAdductDecharger
Number of features to be processed: 80 <- SiriusAdapter
Number of additional MS2 spectra to be processed: 4169 <- SiriusAdapter
checked
Segmentation fault: 11 <- SiriusAdapter
For example, the statement below is already produced by the checkFeatureSpectraNumber
https://github.com/OpenMS/OpenMS/blob/develop/src/openms/source/ANALYSIS/ID/SiriusAdapterAlgorithm.cpp#L296
Number of features to be processed: 80
Number of additional MS2 spectra to be processed: 4169
This means that the mapping has worked. I think I would take another look at the error with the Traceback. Could you please let me know what the Traceback looks like? Unfortunately, it is pretty hard to debug the Python/C++ interface.
Edit: You could also try to reduce the complexity of the dataset for SIRIUS to see if that has an impact, by setting the parameter "preprocessing:filter_by_num_masstraces" to 3, for example.
Ok I see. Yes I am starting to play around with the parameters now, let's see how this works out :)
Please let me know the Traceback of the error you are getting, then I will try to debug it (probably beginning of next week).
Trying to figure it out. I would normally get a traceback error in the output, but I am not getting anything, even when calling import traceback; traceback.print_tb(tb, limit=None, file=None)
Not getting any traceback Oliver. It's just the segfault and it doesn't allow me to trace the error. If you are talking about the ConsensusFeature, this is a warning, not a traceback. Does that answer your Traceback request? Or am I way off? :D
No worries, I am not talking about the ConsensusFeature, but the segfault - as you have guessed correctly.
It is really strange that you are getting a segfault 11 (SIGSEGV) - and that it works without any issues in KNIME.
I think you should work on the parameters and I will try to figure out what goes wrong at the interface level.
For me the segfault looks as follows:
Number of features to be processed: 169
Number of additional MS2 spectra to be processed: 506
checked
[1] 88359 segmentation fault /usr/local/miniconda3/envs/build_pyopenms_39/bin/python
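One note on debugging this: a segfault kills the interpreter before Python can build a traceback, which is why traceback.print_tb shows nothing. The stdlib faulthandler module can at least dump the Python call stack when the crash happens (a sketch, to be placed at the top of the script):

```python
import faulthandler

# Once enabled, a SIGSEGV (or SIGFPE, SIGABRT, ...) makes the interpreter
# print the current Python stack to stderr before dying, so you can see
# which pyopenms call was running when the crash occurred.
faulthandler.enable()
```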
Yes, I will definitely work on the parameters.
For me it's just : Segmentation fault: 11
Nothing else..
Can you check the available memory on the machine and the size of the /tmp directory? Can you also check if there are a lot of .ms files in the /tmp?
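The checks suggested above can be scripted with the stdlib alone; a small sketch (the *.ms glob pattern is an assumption about what the SIRIUS temporary files look like):

```python
import glob
import os
import shutil
import tempfile

# Free space on the volume holding the temp directory.
tmp = tempfile.gettempdir()
usage = shutil.disk_usage(tmp)
print(f"{tmp}: {usage.free / 1e9:.1f} GB free of {usage.total / 1e9:.1f} GB total")

# Count leftover .ms files below the temp directory (pattern is an assumption).
ms_files = glob.glob(os.path.join(tmp, '**', '*.ms'), recursive=True)
print(f"{len(ms_files)} .ms file(s) found under {tmp}")
```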
The memory shouldn't be an issue, because I ran the workflow on our shared machine last week when this error first appeared and I got the exact same issue. The memory on my Mac is only 8 GB and 6.56 GB are being used. I also wouldn't think this is the issue. Let me try again on the shared machine.
PID    USER  PR  NI  VIRT     RES     SHR    S  %CPU   %MEM  TIME+    COMMAND
28962  eeko  20  0   2157688  184524  57132  R  329.7  0.3   0:09.99  python

This is the peak memory usage on the shared machine, which normally has:

MemTotal:     65882684 kB
MemFree:      14907788 kB
MemAvailable: 49163088 kB
I ran it 4 times, and 1 out of 4 times I got these error lines before the segmentation fault:
native_id: scanId=1106230 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=1106330 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=1106429 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=61098 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=61197 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=61297 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=63101 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=63201 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
native_id: scanId=63300 accession: MS:1001508 Could not extract scan number - no valid native_id_type_accession was provided
Segmentation fault (core dumped)
Hm, ok, which machine are you using and what kind of filetype are the raw files?
It's a server that we have in the group with 62 GiB of system memory; processor: Intel(R) Xeon(R) CPU E5-1660 v4 @ 3.20GHz
The raw files are initially .raw (from an Orbitrap fusion ID-X) and converted to mzML (centroid data). Is that what you mean?
That looks quite fishy. Which example file are you using at the moment?
GermicidinAstandard.mzML:
print("Loading")
MzMLFile().load("Standards/GermicidinAstandard.mzML", exp)
print("Loaded")
print(exp.getSourceFiles()[0].getNativeIDTypeAccession())
print(exp.getSourceFiles()[0].getNativeIDType())
MS:1000772
Bruker BAF nativeID format
Oh, my bad. That one was from Bruker :D
https://github.com/eeko-kon/py4e/blob/master/Workflownew.py#L56
This comes from the SiriusMSFile Class, since you would like to store a .ms file (internally - in memory). https://github.com/OpenMS/OpenMS/blob/develop/src/pyOpenMS/pxds/SiriusMSFile.pxd
python:
C++:
In general, you can also check the parameters in the documentation if you do not know what they do: https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Documentation/nightly/html/UTILS_SiriusAdapter.html