PyProphet / pyprophet

PyProphet: Semi-supervised learning and scoring of OpenSWATH results.
http://www.openswath.org
BSD 3-Clause "New" or "Revised" License

sqlite3.Integrity Error #98

Closed fangfeiz closed 3 years ago

fangfeiz commented 3 years ago

image

pyprophet=...../pyprophet
for run in *.osw 
do
$pyprophet score --in=$run --apply_weights=model.osw --level=ms1ms2
done

for run in *.osw 
do
run_reduced=${run}r 
$pyprophet reduce --in=$run --out=$run_reduced
done

$pyprophet merge --template=X_decoys.PQP --out=model_global.osw *.oswr

$pyprophet peptide --context=global --in=model_global.osw

$pyprophet protein --context=global --in=model_global.osw

for run in *.osw 
do
run_reduced=${run}r
$pyprophet backpropagate --in=$run --apply_scores=model_global.osw
done

for run in *.osw 
do
 $pyprophet export --in=$run --max_global_peptide_qvalue=0.05 --max_global_protein_qvalue=0.05  
done

Hi, we used our routine command lines to run PyProphet, but we received the error above for this 6 h data set. Could you please give us a hint about this error? Does the problem come from the OSW results or from the PQP library?

grosenberger commented 3 years ago

What happens when you exclude the last file *_X01_T0.oswr? Does it run through?

fangfeiz commented 3 years ago

image We re-ran without the last file but still get the error.

grosenberger commented 3 years ago

Could you try to execute each block separately? And add a print statement to each for loop to ensure that only the appropriate files (e.g. not the model_global.osw file itself) are being processed?
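A minimal sketch of the suggested check, not taken from the scripts in this thread: list what a glob pattern actually matches before handing anything to pyprophet. The helper name `list_glob_matches` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every file in a directory that matches a glob
# pattern, one name per line, so that unexpected matches (such as model.osw)
# become visible before any pyprophet command runs.
list_glob_matches() {
    local dir="$1"
    local pattern="$2"
    local f
    # $pattern is left unquoted on purpose so the shell expands it.
    for f in "$dir"/$pattern; do
        # When nothing matches, the shell leaves the literal pattern; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "${f##*/}"
    done
}

# Possible usage: list_glob_matches . '*.osw'
```

Running this once per loop pattern (`*.osw`, `*.oswr`) shows whether merged files are being picked up alongside the run files.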

biogreenhand commented 3 years ago

Hello, we are in the same group. I followed your hints and executed the code below block by block, and found that the error occurs during "pyprophet merge". What is the possible reason? *.osw matches only the 3 files above.

pyprophet=/pdiskdata/sunrui/miniconda3/envs/openms/bin/pyprophet
#decoy.PQP=$1

$pyprophet merge --out=model.osw --template=/pdiskdata/sunrui/ToolComp/E_L/E_decoys.PQP *.osws && \

$pyprophet score --in=model.osw --level=ms1ms2 --threads=24

image image

for run in *.osw   
do
$pyprophet score --in=$run --apply_weights=model.osw --level=ms1ms2
done

Info: Applying weights.
Warning: Column var_mi_ratio_score contains only invalid/missing values. Column will be dropped.
Warning: Column var_elution_model_fit_score contains only invalid/missing values. Column will be dropped.
Warning: Column var_im_xcorr_shape contains only invalid/missing values. Column will be dropped.
Warning: Column var_im_xcorr_coelution contains only invalid/missing values. Column will be dropped.
Warning: Column var_im_delta_score contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_lag contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_shape contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_log_sn contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_log_diff contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_log_trend contains only invalid/missing values. Column will be dropped.
Warning: Column var_sonar_rsq contains only invalid/missing values. Column will be dropped.
Warning: Column var_ms1_mi_score contains only invalid/missing values. Column will be dropped.
Warning: Column var_ms1_im_ms1_delta_score contains only invalid/missing values. Column will be dropped.
Warning: Column var_ms1_xcorr_coelution contains only invalid/missing values. Column will be dropped.
Warning: Column var_ms1_xcorr_shape contains only invalid/missing values. Column will be dropped.
Info: Data set contains 199651 decoy and 309115 target groups.
Info: Summary of input data:
Info: 543676 peak groups
Info: 508766 group ids
Info: 32 scores including main score
Info: Start application of pretrained weights.
Info: Finished pretrained scoring.
Info: Data set contains 199651 decoy and 309115 target groups.
Info: Mean qvalue = 8.346956e-01, std_dev qvalue = 1.755231e-01
Info: Mean svalue = 9.591734e-01, std_dev svalue = 1.350082e-01
Info: Finished scoring and estimation statistics.
Info: Finished processing of input data.
Info: Time needed: 00:00:8.5

   qvalue    pvalue    svalue  ...            fp            fn     cutoff
0    0.00  0.000005  0.055256  ...      1.464562  15790.288091  19.654234
1    0.01  0.000070  0.120768  ...     20.503862  14695.327391  15.819921
2    0.02  0.000180  0.150371  ...     52.724216  14200.547745  14.054235
3    0.05  0.000616  0.205211  ...    180.141070  13283.964599  10.648964
4    0.10  0.001628  0.256316  ...    475.982501  12429.806031   7.997010
5    0.20  0.005284  0.370098  ...   1545.112427  10528.935956   4.874388
6    0.30  0.011455  0.466952  ...   3349.452247   8909.275777   3.539418
7    0.40  0.023256  0.610797  ...   6799.959241   6513.782770   2.405653
8    0.50  0.044523  0.778967  ...  13018.487549   3694.311078   1.511578

[9 rows x 12 columns]

Info: G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC1_MHRM_R01_T0.osw written.
Info: G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC1_MHRM_R01_T0_ms1ms2_report.pdf written.

for run in *.osw 
do
run_reduced=${run}r 
$pyprophet reduce --in=$run --out=$run_reduced
done

Info: OSW file was reduced for multi-run scoring.
Info: OSW file was reduced for multi-run scoring.
Info: OSW file was reduced for multi-run scoring.
Info: OSW file was reduced for multi-run scoring.

$pyprophet merge --template=X_decoys.PQP --out=model_global.osw *.oswr

Info: Merged runs of file G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC1_MHRM_R01_T0.oswr to model_global.osw.
Info: Merged runs of file G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC2_MHRM_R01_T0.oswr to model_global.osw.
Info: Merged runs of file G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC3_MHRM_R01_X01_T0.oswr to model_global.osw.
Traceback (most recent call last):
  File "/pdiskdata/sunrui/miniconda3/envs/openms/bin/pyprophet", line 8, in <module>
    sys.exit(cli())
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/click/core.py", line 1289, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/pyprophet/main.py", line 271, in merge
    merge_osw(infiles, outfile, templatefile, same_run)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/pyprophet/levels_contexts.py", line 494, in merge_osw
    merge_oswr(infiles, outfile, templatefile, same_run)
  File "/pdiskdata/sunrui/miniconda3/envs/openms/lib/python3.8/site-packages/pyprophet/levels_contexts.py", line 724, in merge_oswr
    c.executescript('ATTACH DATABASE "%s" AS sdb; INSERT INTO RUN SELECT * FROM sdb.RUN; DETACH DATABASE sdb;' % infile)
sqlite3.IntegrityError: UNIQUE constraint failed: RUN.ID
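The final line of the traceback says that an insert into the RUN table violated the uniqueness of RUN.ID, which would happen if two of the merged input files carry the same run ID. The reduce step above also printed four messages although only three run files were expected, which is consistent with a fourth file (for example a model.oswr produced from model.osw) being matched by `*.oswr`. A minimal sketch for checking this follows; the helper name `report_duplicate_run_ids` is hypothetical, and the table and column names are the ones that appear in the traceback.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "file<TAB>run_id" lines on stdin and print every
# run id that appears on more than one input line, with the files involved.
report_duplicate_run_ids() {
    awk -F '\t' '
        {
            id = $2 ""   # force string handling so large ids are not reformatted
            files[id] = (files[id] == "" ? $1 : files[id] ", " $1)
            count[id]++
        }
        END {
            for (id in count)
                if (count[id] > 1)
                    print id ": " files[id]
        }
    ' | sort
}

# Possible usage, assuming the sqlite3 command-line shell is installed:
# for f in *.oswr; do
#     sqlite3 "$f" 'SELECT ID FROM RUN;' | awk -v f="$f" '{ print f "\t" $0 }'
# done | report_duplicate_run_ids
```

An empty result would mean every run ID occurs in only one input file; any printed line names the files that cannot be merged together.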

grosenberger commented 3 years ago

Could you please upload a minimal example so I can try to reproduce the issue?

biogreenhand commented 3 years ago

OK, thanks very much! These are my files: library and iRT file: https://1drv.ms/u/s!Ag0Rm2-bcc9ogync87l4ZFPX57g_?e=s5CRr3 osw data: https://1drv.ms/u/s!Ag0Rm2-bcc9ogysdo1i8THoYAiRJ?e=0rEprb You can download them from OneDrive.

biogreenhand commented 3 years ago

Hello, I have already replied to you on GitHub. I am uploading the files again via this email. I sincerely look forward to your reply. Thank you very much!


Large attachments sent via NetEase 163 Mail: library.zip (174.06M, expires 2021-08-04 16:07), rawosw.zip (423.37M, expires 2021-08-04 16:07)

grosenberger commented 3 years ago

This is only a single OSW run, however, the merging step of multiple files actually fails, right? For a minimum example, could you please provide three input files of the pyprophet merge step, including the exact used command that results in the issue?

biogreenhand commented 3 years ago


Large attachments sent via NetEase 163 Mail: rawdosw.tar.gz (1.26G, expires 2021-08-04 17:15), openswath.sh (904B, expires 2021-08-04 17:32), pyprophet1_subsample.sh (293B, expires 2021-08-04 17:32), pyprophet2.sh (1.27K, expires 2021-08-04 17:32)

grosenberger commented 3 years ago

The links are unfortunately not accessible.

biogreenhand commented 3 years ago

Could you download these attached ZIP files?


grosenberger commented 3 years ago

No, I think you need to provide a link on the GitHub page directly.

biogreenhand commented 3 years ago

Could you access the onedrive link below? [https://1drv.ms/u/s!AjB4EYuo5rbCjSWFQnWxFxdh8hLY?e=axg10F] (https://1drv.ms/u/s!Ag0Rm2-bcc9ogync87l4ZFPX57g_?e=s5CRr3)

grosenberger commented 3 years ago

I unfortunately can't access the second file. And the first file only contains the three OSW files. A minimal reproducible example should include the following:

biogreenhand commented 3 years ago

Sorry, I have generated new links. Library PQP file and iRT files: https://1drv.ms/u/s!Ag0Rm2-bcc9ogyygO7liBW4nvE4c?e=1OYgnn commands.sh: https://1drv.ms/u/s!Ag0Rm2-bcc9ogzHxUlO7L1DmOPFv?e=ZIPBGZ . It includes the OpenSWATH commands I used to generate these three OSW files; you can ignore those. As for the pyprophet commands, you can use them to test the 3 OSW files. OpenSWATH OSW files: directly from OpenSWATH; you have already received them.

grosenberger commented 3 years ago

OK, so I could download all data and rerun the analysis. It worked without issues. However, within your script, make sure that the for loops are applied only to the run files (for run in G_D181026_S553*.osw) and not to the model.osw file (for run in *.osw would include the model.osw file):

pyprophet subsample  --subsample_ratio=0.3 --out=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC1_MHRM_R01_T0.osws --in=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC1_MHRM_R01_T0.osw
pyprophet subsample  --subsample_ratio=0.3 --out=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC2_MHRM_R01_T0.osws --in=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC2_MHRM_R01_T0.osw
pyprophet subsample  --subsample_ratio=0.3 --out=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC3_MHRM_R01_X01_T0.osws --in=G_D181026_S553-CSH60cm-6h-4ug-W70Res60-TC3_MHRM_R01_X01_T0.osw

pyprophet merge --out=model.osw --template=E_decoys.PQP *.osws

pyprophet score --in=model.osw --level=ms1ms2 --threads=4

for run in G_D181026_S553*.osw 
do
pyprophet score --in=$run --apply_weights=model.osw --level=ms1ms2
done

for run in G_D181026_S553*.osw 
do
run_reduced=${run}r 
pyprophet reduce --in=$run --out=$run_reduced
done

pyprophet merge --template=E_decoys.PQP --out=model_global.osw *.oswr

pyprophet peptide --context=global --in=model_global.osw

pyprophet protein --context=global --in=model_global.osw

for run in G_D181026_S553*.osw 
do
pyprophet backpropagate --in=$run --apply_scores=model_global.osw
done

for run in G_D181026_S553*.osw 
do
pyprophet export --in=$run --max_global_peptide_qvalue=0.05 --max_global_protein_qvalue=0.05 
done
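
As an alternative to a run-specific prefix, the loops could skip the merged model files by name. This is only a sketch under assumptions: `is_run_file` is a hypothetical helper, and it relies on the merged files being called model.osw and model_global.osw as in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for per-run .osw files, i.e. any name
# ending in .osw except the merged model files used in this thread.
is_run_file() {
    case "${1##*/}" in
        model.osw|model_global.osw) return 1 ;;  # merged files, not runs
        *.osw) return 0 ;;                       # per-run OSW file
        *) return 1 ;;                           # anything else (.oswr, .osws, ...)
    esac
}

# Possible usage inside the loops above:
# for run in *.osw; do
#     is_run_file "$run" || continue
#     pyprophet reduce --in="$run" --out="${run}r"
# done
```

Writing the merged files to a separate directory would avoid the overlap as well, without any filtering.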

biogreenhand commented 3 years ago

Oh, it's my fault! Thank you very much!