BGI2016 / IntroSpect

A motif-guided immunopeptidome database building tool to improve the sensitivity of HLA binding peptide identification
Apache License 2.0

error running test scripts #1

Open peterthorpe5 opened 2 years ago

peterthorpe5 commented 2 years ago

Dear IntroSpect, I am running the test scripts. The run fails at the following step (all steps up to this point ran OK):
Runing D.SearchSpace_Filtering.sh

with the following errors:

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

FULL OUTPUT:

sh ./analysis/0.shell/RUNALL.sh

Mon Jan 10 15:29:13 GMT 2022

Runing A.Peptides_Cluster.sh

Call: /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/gibbscluster-2.0/GibbsCluster-2.0e_SA.pl -H /shelf/apps/pjt6/conda/envs/trinity/bin/R -f /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/./analysis/1.cluser//test-1000.9-11mer.txt -g 1-6 -P test-1000.9-11mer -R /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/./analysis/1.cluser//test-1000.9-11mer -C -T -j 2 -l 9 -S 5 -I 0 -D 2 -k 2

Mon Jan 10 15:29:13 2022

Session ID: 1637

Run name: test-1000.9-11mer_1637

Read 996 unique sequences

Settings:

No shift moves, cluster move at every iteration

Number of clusters: 1 - 6

Motif length: 9

Initial MC temperture: 1.5

Number of temperature steps: 20

Number of iterations x Sequence x Tstep: 10

Max insertion length: 0

Max deletion length: 2

Interval between Indel moves: 10

Number of initial seeds: 5

Penalty lambda: 0.8

Weight on small clusters: 5

Sequence weighting type: 0

Use trash cluster to remove outliers: 1

Threshold for trash cluster: 2

Running Gibbs clustering...

Clustering with 1 groups, seed number 1

Clustering with 1 groups, seed number 2

Clustering with 1 groups, seed number 3

Clustering with 1 groups, seed number 4

Clustering with 1 groups, seed number 5

Clustering with 2 groups, seed number 1

Clustering with 2 groups, seed number 2

Clustering with 2 groups, seed number 3

Clustering with 2 groups, seed number 4

Clustering with 2 groups, seed number 5

Clustering with 3 groups, seed number 1

Clustering with 3 groups, seed number 2

Clustering with 3 groups, seed number 3

Clustering with 3 groups, seed number 4

Clustering with 3 groups, seed number 5

Clustering with 4 groups, seed number 1

Clustering with 4 groups, seed number 2

Clustering with 4 groups, seed number 3

Clustering with 4 groups, seed number 4

Clustering with 4 groups, seed number 5

Clustering with 5 groups, seed number 1

Clustering with 5 groups, seed number 2

Clustering with 5 groups, seed number 3

Clustering with 5 groups, seed number 4

Clustering with 5 groups, seed number 5

Clustering with 6 groups, seed number 1

Clustering with 6 groups, seed number 2

Clustering with 6 groups, seed number 3

Clustering with 6 groups, seed number 4

Clustering with 6 groups, seed number 5

Clustering complete!

Determining seeds with highest KLD...

Best 1 groups, seed 4

Best 2 groups, seed 4

Best 3 groups, seed 4

Best 4 groups, seed 2

Best 5 groups, seed 4

Best 6 groups, seed 1

Parsing result files...

RESULTS for 1 CLUSTERS

1 Final Average KLD: 15.443280

1 1 970 15.443

Outliers: 26

#

RESULTS for 2 CLUSTERS

2 Final Average KLD: 12.623349

2 1 649 13.232

2 2 326 11.411

Outliers: 21

#

RESULTS for 3 CLUSTERS

3 Final Average KLD: 11.338390

3 1 555 12.788

3 2 129 10.354

3 3 293 9.026

Outliers: 19

#

RESULTS for 4 CLUSTERS

4 Final Average KLD: 10.575361

4 1 467 10.666

4 2 230 11.296

4 3 277 10.127

4 4 20 6.381

Outliers: 2

#

RESULTS for 5 CLUSTERS

5 Final Average KLD: 10.758199

5 1 221 11.146

5 2 267 10.716

5 3 17 5.970

5 4 469 10.902

5 5 20 7.751

Outliers: 2

#

RESULTS for 6 CLUSTERS

6 Final Average KLD: 10.101165

6 1 443 10.835

6 2 185 8.392

6 3 18 5.896

6 4 0 0.000

6 5 120 10.438

6 6 226 10.218

Outliers: 4

#

Runing B.TrainingSet_Preparation.sh

15.44328 12.623349 11.33839 10.575361 10.758199 10.101165

Runing C.Motif_Learning.sh

Runing D.SearchSpace_Filtering.sh

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

No such file or directory at /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/TEST/../bin//E.Score_Calculation.pl line 29.

Mon Jan 10 15:32:19 GMT 2022

I hope this is easy to resolve, can you please advise?

Regards,

Pete

peterthorpe5 commented 2 years ago

Dear IntroSpect, have you managed to reproduce the error when running your tests? Can you please advise how to resolve it?

BGI2016 commented 2 years ago

Sorry, I've been on vacation recently. I see your email now and I will get back to you as soon as possible.

peterthorpe5 commented 2 years ago

any luck fixing this?

BGI2016 commented 2 years ago

I'm really sorry for the late reply. I didn't take my computer with me on vacation, so I couldn't answer your questions until now.

Now that I have finished my vacation, I think the following steps can help you solve the problem.

1. Open the search space file with vi ./IntroSpect-1.0/TEST/test-1000/test-1000.searchSpace.ss and replace every occurrence of "/ldfssz1/ST_PRECISION/USER/zhangle2/soft/IntroSpect-1.0" with the path to your own "IntroSpect-1.0" folder, so that the path on each line of the file is accessible.

2. Re-run the pipeline. You can try re-running only the last step, sh D.SearchSpace_Filtering.sh (the step that failed in your log).
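The path replacement in step 1 can also be scripted instead of edited by hand in vi. The following is only a minimal sketch, not part of IntroSpect: the function names `rewrite_ss` and `check_ss` are hypothetical, and it assumes that each non-empty line of the .ss file is a plain file path, which the reply suggests but does not spell out. The old prefix is the one quoted above.

```shell
#!/bin/sh
# Sketch only: rewrite_ss and check_ss are hypothetical helpers, not IntroSpect tools.
# OLD_ROOT is the workstation path quoted in the maintainer's reply.
OLD_ROOT="/ldfssz1/ST_PRECISION/USER/zhangle2/soft/IntroSpect-1.0"

# rewrite_ss FILE NEW_ROOT
# Replace every occurrence of OLD_ROOT in FILE with NEW_ROOT.
# The original file is kept as FILE.bak.
# Note: NEW_ROOT must not contain '|', '&' or '\', which are special to sed here.
rewrite_ss() {
    ss_file=$1
    new_root=$2
    cp "$ss_file" "$ss_file.bak" || return 1
    # '|' is the sed delimiter because the paths themselves contain '/'.
    sed "s|$OLD_ROOT|$new_root|g" "$ss_file.bak" > "$ss_file"
}

# check_ss FILE
# ASSUMPTION: each non-empty line of FILE is a file path.
# Print every line that does not exist on disk; return non-zero if any is missing.
check_ss() {
    missing=0
    while IFS= read -r entry; do
        [ -z "$entry" ] && continue
        if [ ! -e "$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done < "$1"
    return "$missing"
}

# Example use (run from the directory that contains IntroSpect-1.0):
#   rewrite_ss ./IntroSpect-1.0/TEST/test-1000/test-1000.searchSpace.ss "$(cd ./IntroSpect-1.0 && pwd)"
#   check_ss   ./IntroSpect-1.0/TEST/test-1000/test-1000.searchSpace.ss
```

If `check_ss` prints nothing, every path listed in the file exists. If the .ss file turns out to hold more than one column per line, the check would need to be adapted.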

Because of my negligence, I copied the absolute path of my own workstation to the searchSpace.ss file, so the program could not find the searchSpace file it needed. Thank you very much for pointing out this problem.

Please also note that my test script is only meant for testing the pipeline, so the search space files provided in the TEST folder (such as ./IntroSpect-1.0/TEST/test-1000/test-1000.searchSpace.ss) contain only small, random sequences with no biological significance.

If you plan to use IntroSpect on your real sample, you should first use your protein database (such as Uniprot.human.protein.fasta) to generate your own searchSpace.ss file, and then pass the path of that searchSpace.ss file as the -ss parameter when running IntroSpect.pl.

Example of generating a searchSpace.ss file and running IntroSpect:

perl DB2SS.pl -file Uniprot.human.protein.fasta -maxLen 15 -minLen 8 -outdir ./ -prefix peterthorpe5

perl IntroSpect.pl -ss peterthorpe5.searchSpace.ss -pep ./pep.txt -prefix peterthorpe5 -GC /storage/home/users/pjt6/IntroSpect/IntroSpect-1.0/gibbscluster-2.0/gibbscluster