Closed Leehyeonjin93 closed 6 years ago
sorry, I sorted all file but I didn't get final file.
I think that isn't sorting problem.
perhaps, Can't vep.vcf,readcount and fpkm_tracking files get letter "chr" in position column?
For the bam-readcount parsing to work, the chromosome names need to match between the VCF and the bam-readcount files. However, sorting is not required.
When you say that you "didn't get a final file", did the pVACseq run error out or does the final file not contain any predicted epitopes? If it errors out, please post the error. If the file is empty (except for the header line) then all your neoeptiopes got filtered out. You can see the unfiltered list of neoepitopes in the <sample_name>.combined.parsed.tsv
file.
thanks, but I didn't get error messages.
so, I checked
so, I checked input files. (readcount file) (vep.VCF file) (cufflinks result file)
but, I don't know cause of problem.
What does your pvacseq command look like? Are you providing a --additional-input-file-list
?
yes, this is my running command
and this is my input log.
thank you.
Hm, I don't see anything obviously wrong. Can you attach the actual data files or email them to our help alias (help@pvactools.org) so that I can run this data through the debugger? It would also be sufficient to send me shortened data files that only contain the entries for variant chr19 58864361, since that is a case we expect to have readcounts attached.
I received your files and ran them locally and am unable to replicate the problem. My additional_input_file list looks like this:
$ cat additional_input_file_list.yaml
gene_expn_file: ch19.genes.tracking
transcript_expn_file: ch19.isoforms.tracking
tdna_snvs_coverage_file: BC16158_T_DNA_bam.snp.Somatic.chr19.sorted.readcount
normal_snvs_coverage_file: BC16158_N_DNA_bam.snp.Somatic.chr19.sorted.readcount
trna_snvs_coverage_file: BC16158_T_RNA_bam.snp.Somatic.chr19.sorted.readcount
The command I ran was:
pvacseq run -e 9 -i additional_input_file_list.yaml BC16158_WES_alignmac.varscan2.snp.Somatic.ch19.vep.vcf BC16158 HLA-A*30:01 NetMHC BC16158
My results look like this. It shows that the vaf, coverage, and expression was parsed correctly for at least some of the samples. The rest were positions that were not in the additional input files:
$ cut -f1-3,15-22 BC16158/MHC_Class_I/BC16158.tsv
chromosome_name start stop transcript_expression gene_expression normal_depth normal_vaf tdna_depth tdna_vaf trna_depth trna_vaf
chr19 13879198 13879199 6.22641 16.7291 219 0.0 294 31.292515942431432 161 54.65838169823716
chr19 13879198 13879199 2.60674 16.7291 219 0.0 294 31.292515942431432 161 54.65838169823716
chr19 14877815 14877816 0.410364 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.154975 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.199177 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 4.55976e-20 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.0689099 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 1.01028e-19 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.179855 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.0819899 2.59447 NA NA NA NA NA NA
chr19 14877815 14877816 0.111585 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.410364 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.154975 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.199177 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 4.55976e-20 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.0689099 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 1.01028e-19 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.179855 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.0819899 2.59447 NA NA NA NA NA NA
chr19 14877819 14877820 0.111585 2.59447 NA NA NA NA NA NA
chr19 20727618 20727619 0.0941472 0.0941614 NA NA NA NA NA NA
chr19 20727619 20727620 0.0941472 0.0941614 NA NA NA NA NA NA
chr19 43579556 43579557 0 0.0 98 0.0 275 33.818180588429804 NA NA
chr19 43579556 43579557 0 0.0 98 0.0 275 33.818180588429804 NA NA
chr19 58864361 58864362 9.01375e-14 7.47729 89 0.0 181 38.12154485516327 8 62.49992187509766
chr19 58864361 58864362 9.72373e-52 7.47729 89 0.0 181 38.12154485516327 8 62.49992187509766
In the combined.parsed.file when I look for the last variant:
$ grep -h -E 'start|58864361' BC16158/MHC_Class_I/BC16158.combined.parsed.tsv | cut -f1-3,22-29 | head
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.01375e-14
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.72373e-52
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.72373e-52
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.01375e-14
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.72373e-52
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.01375e-14
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.72373e-52
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.01375e-14
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.72373e-52
chr19 58864361 58864362 181 38.12154485516327 8 62.49992187509766 89 0.0 7.47729 9.01375e-14
When you ran your pvacseq command the last time, was the output directory empty?
oh... thank you.!! I knew what was problem. in my case, on additional_input_file_list.yaml, two lists were changed each other. like this, gene_expn_file: ch19.genes.tracking transcript_expn_file: ch19.isoforms.tracking normal_snvs_coverage_file: BC16158_N_DNA_bam.snp.Somatic.chr19.sorted.readcount tdna_snvs_coverage_file: BC16158_T_DNA_bam.snp.Somatic.chr19.sorted.readcount trna_snvs_coverage_file: BC16158_T_RNA_bam.snp.Somatic.chr19.sorted.readcount
so, I followed your command. then, removed all previous tsv file and tmp file.
finally, I got final files. thank you!
Hi, I have a problem about inputfile.
I have vep.vcf file that include chromosome position with "chr". but my readcount file have not "chr" letter, DNA(normal,tumor),RNA all of them.
so, I altered readcount file and that included "chr" letter.
but, coverage file was empty. vep.vcf, cufflinks results and readcount file must be sorted? can I use altered readcount file with vep.vcf file?