Closed 24natasya closed 4 years ago
Which version of Octopus are you using?
octopus v0.6.3-beta (develop 7eab0cdd)
Could you please try the latest development version? Note that this comes with a new version of train_random_forest.py that has a different interface to the one you're currently using. I have yet to update the documentation for this, but you should be able to replace your command with:
$ ./train_random_forest.py \
    --config config.json \
    --octopus /tmp/octopus/bin/octopus \
    --rtg /export/home/natasya/rtg-tools-3.10.1/rtg \
    --ranger /home/ranger/ranger/cpp_version/build/ranger \
    --prefix NA12878.wgs \
    -o /export/home/natasya/forests/NA12878_n \
    --threads 10
where config.json contains:
{
    "truths": {
        "GRCh37.HG001": {
            "vcf": "/export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_PGandRTGphasetransfer.vcf.gz",
            "bed": "/export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_nosomaticdel.bed"
        }
    },
    "examples": [
        {
            "reference": "/export/Projects/2019_MLVarCaller/01_BAMs/hs37d5.fa",
            "reads": "/export/Projects/2019_MLVarCaller/01_BAMs/novoalignV4/GIAB_NA12878_GRCh37D_novoalignV4.sort.bam",
            "calling_regions": "/export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/GRCh37_nexterarapidcapture_expandedexome_targetedregions.bed",
            "truth": "GRCh37.HG001"
        }
    ],
    "training": {
        "hyperparameters": [
            {
                "trees": 300,
                "min_node_size": 20
            }
        ]
    }
}
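For anyone converting an old-style command, here is a minimal Python sketch that builds a config with the same layout as the JSON above. The function names build_config and undefined_truth_labels are hypothetical, not part of Octopus. The mapping of the old flags (-R, -I, -T, --truth, --confident, --trees, --min_node_size) to config fields is inferred only from comparing the two commands in this thread, so treat it as an assumption rather than documented behaviour.

```python
import json


def build_config(truth_label, truth_vcf, confident_bed, reference, reads,
                 calling_regions, trees=300, min_node_size=20):
    """Build a dict with the same layout as the config.json shown in this thread.

    Assumed mapping from the old flags (inferred, not documented):
    --truth -> truth_vcf, --confident -> confident_bed, -R -> reference,
    -I -> reads, -T -> calling_regions.
    """
    return {
        "truths": {
            truth_label: {"vcf": truth_vcf, "bed": confident_bed},
        },
        "examples": [
            {
                "reference": reference,
                "reads": reads,
                "calling_regions": calling_regions,
                "truth": truth_label,
            }
        ],
        "training": {
            "hyperparameters": [
                {"trees": trees, "min_node_size": min_node_size}
            ]
        },
    }


def undefined_truth_labels(config):
    """Return the "truth" labels used by examples but missing from "truths"."""
    return [example["truth"] for example in config["examples"]
            if example["truth"] not in config["truths"]]


if __name__ == "__main__":
    # Placeholder file names; substitute your own paths.
    config = build_config("GRCh37.HG001", "truth.vcf.gz", "confident.bed",
                          "reference.fa", "reads.bam", "regions.bed")
    print(json.dumps(config, indent=4))
```

The label check is only a convenience for catching a mistyped "truth" value before a long run; whether train_random_forest.py performs a similar check itself is not stated in this thread.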
Ok, I will try this and see if it works
I ran the training and it works with the new development version. Thank you.
Hi! I have repeatedly run into this problem when training a random forest:
[2019-10-22 09:17:41] An unclassified error has occurred:
[2019-10-22 09:17:41]
[2019-10-22 09:17:41] _Map_base::at.
[2019-10-22 09:17:41]
[2019-10-22 09:17:41] To help resolve this error submit an error report.
[2019-10-22 09:17:41] Encountered error in task writer thread. Calling terminate
terminate called after throwing an instance of 'std::out_of_range'
what(): _Map_base::at
Error: The --calls file
"/export/home/natasya/forests/NA12878_n/octopus.GIAB_NA12878_GRCh37D_novoalignV4.sort.hs37d5.fa.legacy.vcf.gz"
does not exist.
Usage: rtg vcfeval [OPTION]... -b FILE -c FILE -o DIR -t SDF
Try '--help' for more information
[E::hts_open_format] fail to open file '/export/home/natasya/forests/NA12878_n/octopus.GIAB_NA12878_GRCh37D_novoalignV4.sort.hs37d5.fa.eval/tp.vcf.gz'
Failed to open /export/home/natasya/forests/NA12878_n/octopus.GIAB_NA12878_GRCh37D_novoalignV4.sort.hs37d5.fa.eval/tp.vcf.gz: No such file or directory
[E::hts_open_format] Failed to open file /export/home/natasya/forests/NA12878_n/octopus.GIAB_NA12878_GRCh37D_novoalignV4.sort.hs37d5.fa.eval/tp.train.vcf.gz
Traceback (most recent call last):
File "/tmp/octopus/scripts/train_random_forest.py", line 202, in <module>
main(parsed)
File "/tmp/octopus/scripts/train_random_forest.py", line 114, in main
make_ranger_data(tp_train_vcf_path, tp_data_path, True, default_measures, options.missing_value)
File "/tmp/octopus/scripts/train_random_forest.py", line 68, in make_ranger_data
vcf = VariantFile(octopus_vcf_path)
File "pysam/libcbcf.pyx", line 4017, in pysam.libcbcf.VariantFile.__init__
File "pysam/libcbcf.pyx", line 4238, in pysam.libcbcf.VariantFile.open
FileNotFoundError: [Errno 2] could not open variant file b'/export/home/natasya/forests/NA12878_n/octopus.GIAB_NA12878_GRCh37D_novoalignV4.sort.hs37d5.fa.eval/tp.train.vcf.gz': No such file or directory
The command line I used is as below:
./train_random_forest.py \
    -R /export/Projects/2019_MLVarCaller/01_BAMs/hs37d5.fa \
    -I /export/Projects/2019_MLVarCaller/01_BAMs/novoalignV4/GIAB_NA12878_GRCh37D_novoalignV4.sort.bam \
    -T /export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/GRCh37_nexterarapidcapture_expandedexome_targetedregions.bed \
    --truth /export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_PGandRTGphasetransfer.vcf.gz \
    --confident /export/Projects/2019_MLVarCaller/02_TruthSets/GIAB_NA12878/HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_nosomaticdel.bed \
    --octopus /tmp/octopus/bin/octopus \
    --rtg /export/home/natasya/rtg-tools-3.10.1/rtg \
    --sdf /export/Projects/2019_MLVarCaller/01_BAMs/hs37d5.sdf \
    --ranger /home/ranger/ranger/cpp_version/build/ranger \
    --trees 300 \
    --min_node_size 20 \
    --missing_value -1 \
    --prefix NA12878.wgs \
    -o /export/home/natasya/forests/NA12878_n/ \
    --threads 10 >
What could be causing this error? I ran the same command on the same sample with a different aligner, and there was no error.
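The log above reads as a cascade: the Octopus run terminated, so the .legacy.vcf.gz file was never written, and rtg vcfeval and pysam then failed on files that depended on it. A minimal Python sketch of a check that reports the first missing output, so the earliest failing step is obvious. The names first_missing_output and require_output are hypothetical helpers, not part of train_random_forest.py, and the step labels are only illustrative.

```python
from pathlib import Path


def first_missing_output(expected):
    """Return the first (step, path) pair whose file is missing, else None.

    `expected` is an ordered iterable of (step_name, path) pairs, listed in
    the order the pipeline steps run.
    """
    for step, path in expected:
        if not Path(path).is_file():
            return step, str(path)
    return None


def require_output(step, path):
    """Raise a clear error if `step` did not leave a file at `path`."""
    if not Path(path).is_file():
        raise FileNotFoundError(
            f"{step} did not produce {path}; check that step's log first")
    return Path(path)
```

Called with the pipeline's outputs in run order, for example the calls VCF first and the vcfeval tp.vcf.gz second, first_missing_output would point at the Octopus step here rather than at the final pysam traceback.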