griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Fasta is empty #88

Closed yang-yangfeng closed 5 years ago

yang-yangfeng commented 6 years ago

working directory: /gscmnt/gc2547/griffithlab/yafeng/PRAD

command: pvacfuse run --net-chop-method cterm --netmhc-stab --iedb-install-directory /gscmnt/gc2502/griffithlab/yafeng -e 8,9,10,11 TCGA-EJ-8474-01/fusion_antigen_out/TCGA-EJ-8474-01.bedpe.annot sample HLA-A*68:01,HLA-B*58:01,HLA-C*12:03 NNalign NetMHC NetMHCIIpan NetMHCcons NetMHCpan PickPocket SMM SMMPMBEC SMMalign TCGA-EJ-8474-01/pvacfuse_output


```
Converting .bedpe to TSV
Completed
Splitting TSV into smaller chunks
Splitting TSV into smaller chunks - Entries 1-13
Completed
Generating Variant Peptide FASTA and Key Files
Generating Variant Peptide FASTA and Key Files - Entries 1-26
Wildtype sequence length is shorter than desired peptide sequence length at position (10 / 4, 59352269 / 6696892, -1 / -1). Using wildtype sequence length (9) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (17 / 19, 16171802 / 42651086, -1 / -1). Using wildtype sequence length (1) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (10 / 1, -1 / -1, 102594065 / 23408726). Using wildtype sequence length (6) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (13 / 4, 95161188 / 82904735, -1 / -1). Using wildtype sequence length (1) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (5 / 5, 33998640 / 61073124, -1 / -1). Using wildtype sequence length (19) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (5 / 5, 33998640 / 61098991, -1 / -1). Using wildtype sequence length (19) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (5 / 5, 33998640 / 61152703, -1 / -1). Using wildtype sequence length (19) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (5 / 5, 33998640 / 61073124, -1 / -1). Using wildtype sequence length (19) instead.
Wildtype sequence length is shorter than desired peptide sequence length at position (5 / 5, 33998640 / 61098991, -1 / -1). Using wildtype sequence length (19) instead.
Completed
Processing entries for Allele HLA-A*68:01 and Epitope Length 8 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-A*68:01 and Epitope Length 9 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-A*68:01 and Epitope Length 10 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-A*68:01 and Epitope Length 11 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-B*58:01 and Epitope Length 8 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-B*58:01 and Epitope Length 9 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-B*58:01 and Epitope Length 10 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-B*58:01 and Epitope Length 11 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-C*12:03 and Epitope Length 8 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-C*12:03 and Epitope Length 9 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-C*12:03 and Epitope Length 10 - Entries 1-26
Fasta file is empty. Skipping
Processing entries for Allele HLA-C*12:03 and Epitope Length 11 - Entries 1-26
Fasta file is empty. Skipping
No output files were created. Aborting.```
susannasiebert commented 6 years ago

I'm going line by line in /gscmnt/gc2547/griffithlab/yafeng/PRAD/TCGA-EJ-8474-01/pvacfuse_output/MHC_Class_I/sample.tsv to explain why each variant was filtered out:

- FAM13C>>S100P_1.inframe_fusion.68: position is 68 which is out of bounds of the sequence
- NCOR1>>LIPE-AS1_1.inframe_fusion.64: sequence is only X
- TCEA3>>PRR16_1.inframe_fusion.46: position is 46 which is out of bounds of the sequence
- SRSF4>>RSBN1L_1.inframe_fusion.37: position is 37 which is out of bounds of the sequence
- SUFU>>TCEA3_1.inframe_fusion.24: position is 24 which is out of bounds of the sequence
- ABCC4>>THAP9_1.inframe_fusion.48: sequence is only X
- AMACR>>NDUFAF2_1.inframe_fusion.62: position is 62 which is out of bounds of the sequence
- AMACR>>NDUFAF2_1.inframe_fusion.62: position is 62 which is out of bounds of the sequence
- AMACR>>NDUFAF2_1.inframe_fusion.62: position is 62 which is out of bounds of the sequence
- NDUFAF2>>AMACR_1.inframe_fusion.57: position is 57 which is out of bounds of the sequence
- C1QTNF3-AMACR>>NDUFAF2_1.inframe_fusion.62: position is 62 which is out of bounds of the sequence
- C1QTNF3-AMACR>>NDUFAF2_1.inframe_fusion.62: position is 62 which is out of bounds of the sequence
- NDUFAF2>>C1QTNF3-AMACR_1.inframe_fusion.57: position is 57 which is out of bounds of the sequence
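For anyone wanting to repeat this check on their own entries, the two filter reasons above can be sketched roughly as follows. This is only an illustration, not pVACfuse's actual code: the function name `filter_reason` is hypothetical, the position is assumed to be the trailing number of the identifier and to be 0-based, and the sequences in the usage lines are made-up placeholders rather than the real contents of sample.tsv.

```python
def filter_reason(entry_id, sequence):
    """Return why an entry would be skipped, or None if it looks usable.

    Hypothetical helper for illustration only; not part of pVACtools.
    """
    # Assumption: the number after the last "." is the fusion position,
    # e.g. "FAM13C>>S100P_1.inframe_fusion.68" -> 68.
    position = int(entry_id.rsplit(".", 1)[1])

    # A sequence made up entirely of "X" carries no usable residues.
    if sequence and set(sequence) == {"X"}:
        return "sequence is only X"

    # Assumption: positions are 0-based, so a valid position is < len(sequence).
    if position >= len(sequence):
        return f"position is {position} which is out of bounds of the sequence"

    return None


# Placeholder sequences, not the real data from the issue.
print(filter_reason("FAM13C>>S100P_1.inframe_fusion.68", "MKTAYIAKQR"))
print(filter_reason("NCOR1>>LIPE-AS1_1.inframe_fusion.64", "X"))
```

If the position turns out to be 1-based, the bounds comparison would need to be `position > len(sequence)` instead.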

yang-yangfeng commented 6 years ago

Hmm so I guess it could just be coincidence that this sample didn't produce any novel peptides?

susannasiebert commented 6 years ago

Well, it's strange. The position is supposed to denote where the fusion junction falls within the full fusion sequence; we then use it as the midpoint for determining the shorter FASTA sequence that gets fed to IEDB. But in your file the positions all fall after the end of the full fusion sequence, so I'm wondering if pVACfuse is interpreting the position incorrectly. This would be something to check with Jin.
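To make the midpoint idea concrete, here is a minimal sketch of cutting a shorter sequence around a fusion position. It is an assumption-laden illustration rather than the pVACfuse implementation: `window_around` and its `flank` parameter are hypothetical names, the position is treated as 0-based, and the thread does not say how many flanking residues pVACfuse actually keeps.

```python
def window_around(sequence, position, flank):
    """Return the subsequence centred on `position`, or None if out of bounds.

    Hypothetical helper for illustration only; not part of pVACtools.
    """
    # A position past the end of the sequence cannot serve as a midpoint,
    # which is the situation described for the entries in this issue.
    if not 0 <= position < len(sequence):
        return None

    # Keep up to `flank` residues on each side, clipped to the sequence ends.
    start = max(0, position - flank)
    end = min(len(sequence), position + flank + 1)
    return sequence[start:end]


# Placeholder sequence, not real fusion data.
print(window_around("ABCDEFGHIJ", 5, 2))   # DEFGH
print(window_around("ABCDEFGHIJ", 68, 2))  # None
```

With a position beyond the sequence length there is nothing to centre the window on, so no peptide sequence is produced for that entry, which would be consistent with the empty FASTA files in the log.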

susannasiebert commented 6 years ago

@yang-yangfeng any progress on working with Jin on this?

susannasiebert commented 5 years ago

@yang-yangfeng bumping this issue. Did this ever get resolved?