wenbostar / Customprodbj

Customized protein database construction
GNU General Public License v3.0

java.lang.NullPointerException #1

Open wenyuhaokikika opened 1 year ago

wenyuhaokikika commented 1 year ago

Thanks for Customprodbj. It is a great project, and I have not found a better package.

However, I ran into a problem while running it, as described below.

OS

$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

environment

$ java -version
openjdk version "1.8.0_152-release"
OpenJDK Runtime Environment (build 1.8.0_152-release-1056-b12)
OpenJDK 64-Bit Server VM (build 25.152-b12, mixed mode)

$ perl -v

This is perl 5, version 32, subversion 1 (v5.32.1) built for x86_64-linux-thread-multi

First I downloaded ANNOVAR:

# download annovar.latest.tar.gz, then unpack it into ./annovar
tar -xzf annovar.latest.tar.gz
wget https://github.com/wenbostar/Customprodbj/releases/download/v1.2.0/customprodbj-1.2.0.jar
ls
>>> annovar  annovar.latest.tar.gz  customprodbj-1.2.0.jar

Then I ran:

perl annovar/table_annovar.pl annovar/example/ex2.vcf annovar/humandb \
-buildver hg19 -out out/test -protocol refGene -operation g -nastring . \
-vcfinput --thread 30 --maxgenethread 30 -polish

That worked. Next I ran:

java -jar customprodbj-1.2.0.jar -i out/test.hg19_multianno.vcf \
-d annovar/humandb/hg19_refGeneMrna.fa \
-r annovar/humandb/hg19_refGene.txt -t \
-o out/

It raised a NullPointerException:

$ java -jar customprodbj-1.2.0.jar -i out/test.hg19_multianno.vcf \
-d annovar/humandb/hg19_refGeneMrna.fa \
-r annovar/humandb/hg19_refGene.txt -t \
-o out/
log file:out//log.txt
Read mRNA sequences: 72119
Used mRNA sequences: out//mRNA_seq.fasta
Remove transcripts: 1562
Exception in thread "main" java.lang.NullPointerException
        at main.java.DbCreator.run(DbCreator.java:490)
        at main.java.DbCreator.main(DbCreator.java:206)

Please tell me how to fix it.

Thank you, and good luck!

wenbostar commented 1 year ago

Could you share with me your input files for customprodbj?

wenyuhaokikika commented 1 year ago

I used this command:

java -jar customprodbj-1.2.0.jar -i test.hg19_multianno.vcf \
-d annovar/humandb/hg19_refGeneMrna.fa \
-r annovar/humandb/hg19_refGene.txt -t \
-o out/

test.hg19_multianno.vcf

##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
##INFO=<ID=ANNOVAR_DATE,Number=1,Type=String,Description="Flag the start of ANNOVAR annotation for one alternative allele">
##INFO=<ID=Func.refGene,Number=.,Type=String,Description="Func.refGene annotation provided by ANNOVAR">
##INFO=<ID=Gene.refGene,Number=.,Type=String,Description="Gene.refGene annotation provided by ANNOVAR">
##INFO=<ID=GeneDetail.refGene,Number=.,Type=String,Description="GeneDetail.refGene annotation provided by ANNOVAR">
##INFO=<ID=ExonicFunc.refGene,Number=.,Type=String,Description="ExonicFunc.refGene annotation provided by ANNOVAR">
##INFO=<ID=AAChange.refGene,Number=.,Type=String,Description="AAChange.refGene annotation provided by ANNOVAR">
##INFO=<ID=ALLELE_END,Number=0,Type=Flag,Description="Flag the end of ANNOVAR annotation for one alternative allele">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NA00001 NA00002 NA00003
16  50745926    rs2066844   C   T   80  PASS    NS=3;DP=14;AF=0.5;DB;H2;ANNOVAR_DATE=2020-06-08;Func.refGene=exonic;Gene.refGene=NOD2;GeneDetail.refGene=.;ExonicFunc.refGene=nonsynonymous_SNV;AAChange.refGene=NOD2:NM_001293557:exon3:c.C2023T:p.R675W,NOD2:NM_001370466:exon4:c.C2023T:p.R675W,NOD2:NM_022162:exon4:c.C2104T:p.R702W;ALLELE_END GT:GQ:DP:HQ 0|0:48:1:51,51  1|0:48:8:51,51  1/1:43:5:.,.
20  14370   rs6054257   G   A   29  PASS    NS=3;DP=14;AF=0.5;DB;H2;ANNOVAR_DATE=2020-06-08;Func.refGene=intergenic;Gene.refGene=NONE\x3bDEFB125;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d53943;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END    GT:GQ:DP:HQ 0|0:48:1:51,51  1|0:48:8:51,51  1/1:43:5:.,.
20  17330   .   T   A   3   q10 NS=3;DP=11;AF=0.017;ANNOVAR_DATE=2020-06-08;Func.refGene=intergenic;Gene.refGene=NONE\x3bDEFB125;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d50983;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END    GT:GQ:DP:HQ 0|0:49:3:58,50  0|1:3:5:65,3    0/0:41:3
20  1110696 rs6040355   A   G,T 67  PASS    NS=2;DP=10;AF=0.333,0.667;AA=T;DB;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=PSMF1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=PSMF1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END GT:GQ:DP:HQ 1|2:21:6:23,27  2|1:2:0:18,2    2/2:35:4
20  1230237 .   T   G   47  PASS    NS=3;DP=13;AA=T;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=RAD21L1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END  GT:GQ:DP:HQ 0|0:54:7:56,60  0|0:48:4:51,51  0/0:61:2
20  1230288 .   T   .   50  PASS    NS=3;DP=13;AA=T;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=RAD21L1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END  GT:GQ:DP:HQ 0|0:54:7:56,60  0|0:48:4:51,51  0/0:61:2
20  1234567 microsat1   GTCT    G,GTACT 50  PASS    NS=3;DP=9;AA=G;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=RAD21L1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END;ANNOVAR_DATE=2020-06-08;Func.refGene=intronic;Gene.refGene=RAD21L1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;ALLELE_END    GT:GQ:DP    0/1:35:4    0/2:17:2    1/1:40:3
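
Note for anyone reproducing this: the sketch below is not part of Customprodbj or ANNOVAR. It only shows one way to list the transcript IDs named in the `AAChange.refGene` INFO field of records like those above, so they can be looked up in the other two input files. The function name `aachange_transcripts` is made up for illustration, and the code assumes each entry has the `gene:transcript:exon:cDNA:protein` layout seen in the first record.

```python
def aachange_transcripts(info):
    """Return the transcript IDs found in every AAChange.refGene entry of a VCF INFO string."""
    ids = []
    for field in info.split(";"):
        key, _, value = field.partition("=")
        # Multi-allelic records repeat the key once per ALT allele; "." means no annotation.
        if key != "AAChange.refGene" or value in ("", "."):
            continue
        for entry in value.split(","):
            parts = entry.split(":")
            if len(parts) >= 2:
                ids.append(parts[1])  # second item is the transcript accession
    return ids


# INFO column of the chr16 record shown above.
info = (
    "NS=3;DP=14;AF=0.5;DB;H2;ANNOVAR_DATE=2020-06-08;Func.refGene=exonic;"
    "Gene.refGene=NOD2;GeneDetail.refGene=.;ExonicFunc.refGene=nonsynonymous_SNV;"
    "AAChange.refGene=NOD2:NM_001293557:exon3:c.C2023T:p.R675W,"
    "NOD2:NM_001370466:exon4:c.C2023T:p.R675W,"
    "NOD2:NM_022162:exon4:c.C2104T:p.R702W;ALLELE_END"
)
print(aachange_transcripts(info))
```

In this file only the chr16 record carries protein-level annotation; the other records have `AAChange.refGene=.` and yield an empty list.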

The ANNOVAR package was downloaded from http://www.openbioinformatics.org/annovar/download/0wgxR2rIVP/annovar.latest.tar.gz

annovar/humandb/hg19_refGeneMrna.fa

>NR_148357   Comment: this sequence (leftmost exon at chr1:11868) is generated by ANNOVAR on Sat Feb 29 11:33:29 2020, based on regions specified in humandb/hg19_refGene.txt and sequence files stored at humandb/hg19_seq.
GTTAACTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCAGACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCAGGTGTCTG
ACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAG
GCATAGGGGAAAGATTGGAGGAAAGATGAGTGAGAGCATCAACTTCTCTCACAACCTAGGCCAGTGTGTGGTGATGCCAGGCATGCCCTTCCCCAGCATCAGGTCTCCAGAGCTGCAGAAGACGACGGCCGACTTGGATCACACTCTT
GTGAGTGTCCCCAGTGTTGCAGAGGCAGGGCCATCAGGCACCAAAGGGATTCTGCCAGCATAGTGCTCCTGGACCAGTGATACACCCGGCACCCTGTCCTGGACACGCTGTTGGCCTGGATCTGAGCCCTGGTGGAGGTCAAAGCCAC
CTTTGGTTCTGCCATTGCTGCTGTGTGGAAGTTCACTCCTGCCTTTTCCTTTCCCTAGAGCCTCCACCACCCCGAGATCACATTTCTCACTGCCTTTTGTCTGCCCAGTTTCACCAGAAGTAGGCCTCTTCCTGACAGGCAGCTGCAC
CACTGCCTGGCGCTGTGCCCTTCCTTTGCTCTGCCCGCTGGAGACGGTGTTTGTCATGGGCCTGGTCTGCAGGGATCCTGCTACAAAGGTGAAACCCAGGAGAGTGTGGAGTCCAGAGTGTTGCCAGGACCCAGGCACAGGCATTAGT
GCCCGTTGGAGAAAACAGGGGAATCCCGAAGAAATGGTGGGTCCTGGCCATCCGTGAGATCTTCCCAGGGCAGCTCCCCTCTGTGGAATCCAATCTGTCTTCCATCCTGCGTGGCCGAGGGCCAGGCTTCTCACTGGGCCTCTGCAGG
AGGCTGCCATTTGTCCTGCCCACCTTCTTAGAAGCGAGACGGAGCAGACCCATCTGCTACTGCCCTTTCTATAATAACTAAAGTTAGCTGCCCTGGACTATTCACCCCCTAGTCTCAATTTAAGAAGATCCCCATGGCCACAGGGCCC
CTGCCTGGGGGCTTGTCACCTCCCCCACCTTCTTCCTGAGTCATTCCTGCAGCCTTGCTCCCTAACCTGCCCCACAGCCTTGCCTGGATTTCTATCTCCCTGGCTTGGTGCCAGTTCCTCCAAGTCGATGGCACCTCCCTCCCTCTCA
ACCACTTGAGCAAACTCCAAGACATCTTCTACCCCAACACCAGCAATTGTGCCAAGGGCCATTAGGCTCTCAGCATGACTATTTTTAGAGACCCCGTGTCTGTCACTGAAACCTTTTTTGTGGGAGACTATTCCTCCCATCTGCAACA
GCTGCCCCTGCTGACTGCCCTTCTCTCCTCCCTCTCATCCCAGAGAAACAGGTCAGCTGGGAGCTTCTGCCCCCACTGCCTAGGGACCAACAGGGGCAGGAGGCAGTCACTGACCCCGAGACGTTTGCAT
>NR_046018   Comment: this sequence (leftmost exon at chr1:11873) is generated by ANNOVAR on Sat Feb 29 11:33:29 2020, based on regions specified in humandb/hg19_refGene.txt and sequence files stored at humandb/hg19_seq.
CTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCAGACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCAGGTGTCTGACTTC
CAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATA
GGGGAAAGATTGGAGGAAAGATGAGTGAGAGCATCAACTTCTCTCACAACCTAGGCCAGTGTGTGGTGATGCCAGGCATGCCCTTCCCCAGCATCAGGTCTCCAGAGCTGCAGAAGACGACGGCCGACTTGGATCACACTCTTGTGAG
TGTCCCCAGTGTTGCAGAGGCAGGGCCATCAGGCACCAAAGGGATTCTGCCAGCATAGTGCTCCTGGACCAGTGATACACCCGGCACCCTGTCCTGGACACGCTGTTGGCCTGGATCTGAGCCCTGGTGGAGGTCAAAGCCACCTTTG
GTTCTGCCATTGCTGCTGTGTGGAAGTTCACTCCTGCCTTTTCCTTTCCCTAGAGCCTCCACCACCCCGAGATCACATTTCTCACTGCCTTTTGTCTGCCCAGTTTCACCAGAAGTAGGCCTCTTCCTGACAGGCAGCTGCACCACTG
CCTGGCGCTGTGCCCTTCCTTTGCTCTGCCCGCTGGAGACGGTGTTTGTCATGGGCCTGGTCTGCAGGGATCCTGCTACAAAGGTGAAACCCAGGAGAGTGTGGAGTCCAGAGTGTTGCCAGGACCCAGGCACAGGCATTAGTGCCCG
TTGGAGAAAACAGGGGAATCCCGAAGAAATGGTGGGTCCTGGCCATCCGTGAGATCTTCCCAGGGCAGCTCCCCTCTGTGGAATCCAATCTGTCTTCCATCCTGCGTGGCCGAGGGCCAGGCTTCTCACTGGGCCTCTGCAGGAGGCT
GCCATTTGTCCTGCCCACCTTCTTAGAAGCGAGACGGAGCAGACCCATCTGCTACTGCCCTTTCTATAATAACTAAAGTTAGCTGCCCTGGACTATTCACCCCCTAGTCTCAATTTAAGAAGATCCCCATGGCCACAGGGCCCCTGCC
TGGGGGCTTGTCACCTCCCCCACCTTCTTCCTGAGTCATTCCTGCAGCCTTGCTCCCTAACCTGCCCCACAGCCTTGCCTGGATTTCTATCTCCCTGGCTTGGTGCCAGTTCCTCCAAGTCGATGGCACCTCCCTCCCTCTCAACCAC
...
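
Note for anyone reproducing this: the log above reports `Read mRNA sequences: 72119`. A minimal Python sketch for listing the record IDs in a FASTA file like this one follows, so the header count can be compared with that number. Whether Customprodbj counts records the same way is an assumption, and `fasta_ids` is a hypothetical helper, not part of either tool. The sample headers and sequences are abbreviated from the excerpt above.

```python
import io


def fasta_ids(handle):
    """Yield the first whitespace-delimited token of every '>' header line."""
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                yield tokens[0]


# Two abbreviated records in the same layout as hg19_refGeneMrna.fa.
sample = io.StringIO(
    ">NR_148357   Comment: this sequence (leftmost exon at chr1:11868)\n"
    "GTTAACTTGCCGTCAGCC\n"
    ">NR_046018   Comment: this sequence (leftmost exon at chr1:11873)\n"
    "CTTGCCGTCAGCCTTTTC\n"
)
ids = list(fasta_ids(sample))
print(len(ids), ids)
```

For the real file, `open("annovar/humandb/hg19_refGeneMrna.fa")` can be passed in place of the `io.StringIO` sample.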

annovar/humandb/hg19_refGene.txt

585     NR_148357       chr1    +       11868   14362   14362   14362   3       11868,12612,13220,      12227,12721,14362,      0       LOC102725121    unk     unk     -1,-1,-1,
585     NR_046018       chr1    +       11873   14409   14409   14409   3       11873,12612,13220,      12227,12721,14409,      0       DDX11L1 unk     unk     -1,-1,-1,
96      NM_015330       chr22   +       24666798        24813706        24698199        24810591        17      24666798,24672667,24698162,24709280,24717255,24720187,24724813,24726223,24730377,24734353,24743053,24759228,24761443,24765185,24807555,24808615,24810501,   24666951,24672771,24698352,24709434,24718886,24720395,24724887,24726399,24730541,24734445,24743144,24759312,24761600,24765288,24807672,24808675,24813706,   0       SPECC1L cmpl    cmpl    -1,-1,0,0,1,0,1,0,2,1,0,1,1,2,0,0,0,
585     NR_106918       chr1    -       17368   17436   17436   17436   1       17368,  17436,  0       MIR6859-1       unk     unk     -1,
585     NR_107062       chr1    -       17368   17436   17436   17436   1       17368,  17436,  0       MIR6859-2       unk     unk     -1,
585     NR_107063       chr1    -       17368   17436   17436   17436   1       17368,  17436,  0       MIR6859-3       unk     unk     -1,
585     NR_128720       chr1    -       17368   17436   17436   17436   1       17368,  17436,  0       MIR6859-4       unk     unk     -1,
585     NR_036051       chr1    +       30365   30503   30503   30503   1       30365,  30503,  0       MIR1302-2       unk     unk     -1,
585     NR_036266       chr1    +       30365   30503   30503   30503   1       30365,  30503,  0       MIR1302-9       unk     unk     -1,
585     NR_036267       chr1    +       30365   30503   30503   30503   1       30365,  30503,  0       MIR1302-10      unk     unk     -1,
585     NR_036268       chr1    +       30365   30503   30503   30503   1       30365,  30503,  0       MIR1302-11      unk     unk     -1,
585     NR_026818       chr1    -       34610   36081   36081   36081   3       34610,35276,35720,      35174,35481,36081,      0       FAM138A unk     unk     -1,-1,-1,
585     NR_026820       chr1    -       34610   36081   36081   36081   3       34610,35276,35720,      35174,35481,36081,      0       FAM138F unk     unk     -1,-1,-1,
585     NR_026822       chr1    -       34610   36081   36081   36081   3       34610,35276,35720,      35174,35481,36081,      0       FAM138C unku
...
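
Note for anyone reproducing this: the thread does not establish what triggers the exception. One check that may be worth running is whether every transcript named in the VCF annotation also appears in both `hg19_refGene.txt` and `hg19_refGeneMrna.fa`. The Python sketch below assumes the refGene file is tab-delimited with the transcript accession in the second column, as in the excerpt above. The helper names `refgene_ids` and `missing_transcripts` are hypothetical, and the data is a toy subset, so the "missing" result only reflects how small the sample is.

```python
import io


def refgene_ids(handle):
    """Collect the transcript accession (second tab-separated column) of each refGene row."""
    ids = set()
    for line in handle:
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 2 and cols[1]:
            ids.add(cols[1])
    return ids


def missing_transcripts(wanted, refgene, fasta):
    """Return the wanted IDs that are absent from the refGene table or the FASTA file."""
    return sorted(t for t in set(wanted) if t not in refgene or t not in fasta)


# Toy inputs: two truncated refGene rows, two FASTA IDs, two IDs to look up.
rows = io.StringIO(
    "585\tNR_148357\tchr1\t+\t11868\t14362\n"
    "585\tNR_046018\tchr1\t+\t11873\t14409\n"
)
refgene = refgene_ids(rows)
fasta = {"NR_148357", "NR_046018"}
wanted = ["NM_022162", "NR_046018"]
missing = missing_transcripts(wanted, refgene, fasta)
print(missing)
```

If the same check on the full files reports missing IDs, that would point to a mismatch between the annotation run and the `humandb` files, which could be mentioned when sharing the inputs.
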

wenbostar commented 1 year ago

It would be helpful if you can share the files with me so that I can reproduce the error for debugging.

wenyuhaokikika commented 1 year ago

Here are the three files:

https://drive.google.com/file/d/1M59sK54C6TWMMnk5-VgjkQhLswZylk0e/view?usp=sharing