Closed mafreitas closed 7 years ago
Hi,
I'm testing your PR. I saw that you multiplied all the retention-time-related parameters, such as RT_P and rt_W, by 60. What is the unit of the RT in your input file? Minutes or seconds?
I always assumed that the RT in mzXML files converted from Thermo Raw is in minutes. Am I wrong? Maybe this is not true for other vendor file formats.
I had to change the time so that it would work with our peptide output (produced by our in-house build of the MassMatrix search engine), whose retention times are in minutes. Data conversions were done with ProteoWizard.
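The unit mismatch discussed above can be made explicit with a small helper. This is only a sketch: `rt_to_seconds` and the unit labels are hypothetical names, not part of moFF, and it assumes the unit of each RT source is known up front (mzML, for instance, records a unit alongside the scan start time).

```python
SECONDS_PER_MINUTE = 60.0


def rt_to_seconds(value, unit):
    """Convert a retention time or RT window to seconds.

    `unit` is a hypothetical label, either "minute" or "second".
    """
    if unit == "minute":
        return value * SECONDS_PER_MINUTE
    if unit == "second":
        return float(value)
    raise ValueError("unknown RT unit: %r" % (unit,))


# Example: a 3-minute window expressed in seconds.
rt_w_seconds = rt_to_seconds(3, "minute")
```

Converting once at the input boundary keeps the rest of the code in a single unit, so the parameters themselves do not need to be multiplied by 60 for some inputs and not for others.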
Hi, I merged your pull request into master.
On top of your speedup, I also included some other modifications:
restored the input parameters (rt_w, rt_p, rt_p_match) in the second .
The first read of the mzML file, which saves the RT values and scan IDs, is done in the function scan_mzml:
    import pymzml

    def scan_mzml(name):
        # Read the mzML file once and collect the RT and ID of every MS1 spectrum.
        if 'MZML' in name.upper():
            rt_list = []
            runid_list = []
            run_temp = pymzml.run.Reader(name)
            for spectrum in run_temp:
                if spectrum['ms level'] == 1:
                    rt_list.append(spectrum['scan start time'])
                    runid_list.append(spectrum['id'])
            return (rt_list, runid_list)
        else:
            # For a raw file, return -1, -1 as a placeholder result.
            return (-1, -1)
This function is called just once, before the input file is processed in the multithreaded step. This implementation gives a small time improvement over your original, where each thread had to read and scan the file. It would also be nice to read the mzML file just once and pass it directly to
    result[df_index] = myPool.apply_async(apex_multithr, args=(
        data_split[df_index], name, args.raw_list, tol, h_rt_w, s_w, s_w_match,
        loc_raw, loc_output, offset, rt_list, id_list))
but I had problems passing a pymzml.run.Reader() object to the pool. I will leave this improvement for the future.
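A minimal sketch of the "scan once, share plain lists" pattern described above, using only the standard library. Everything here is illustrative rather than moFF's actual code: `scans_in_window`, `worker`, `is_picklable` and `run_pool` are hypothetical names, the bisect-based window lookup is an assumption about how the lists might be used, and `rt_list` is assumed to be sorted in ascending order. One likely reason the Reader could not be handed to the pool is that `apply_async` pickles its arguments, and objects holding an open file handle generally cannot be pickled; that cause is a guess, not something confirmed in this thread.

```python
import pickle
from bisect import bisect_left, bisect_right
from multiprocessing import Pool


def scans_in_window(rt_list, id_list, rt, half_width):
    """Return the IDs of scans whose RT lies in [rt - half_width, rt + half_width].

    Assumes rt_list is sorted ascending and aligned with id_list.
    """
    lo = bisect_left(rt_list, rt - half_width)
    hi = bisect_right(rt_list, rt + half_width)
    return id_list[lo:hi]


def worker(chunk, rt_list, id_list, half_width):
    """Process one chunk of target RTs using the pre-scanned lists."""
    return [scans_in_window(rt_list, id_list, rt, half_width) for rt in chunk]


def is_picklable(obj):
    """Check whether obj could be sent to a worker process as an argument."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def run_pool(chunks, rt_list, id_list, half_width, processes=2):
    """Submit one task per chunk; only plain lists and floats cross the process boundary."""
    with Pool(processes) as pool:
        pending = [
            pool.apply_async(worker, args=(chunk, rt_list, id_list, half_width))
            for chunk in chunks
        ]
        return [p.get() for p in pending]
```

`run_pool` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Plain lists of floats and strings pickle without trouble, which is why passing `rt_list` and `id_list` works where passing a file-backed reader object may not.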
I also modified moff_all.py in the same way, in order to get the speedup when the user calls both mbr and apex.
I would appreciate it if you could run some tests on this version to spot any bugs. Thank you again for your contribution. Andrea
Here are the changes