luntergroup / octopus

Bayesian haplotype-based mutation calling
MIT License

Converting CSV file from classified calls #82

Closed 24natasya closed 4 years ago

24natasya commented 5 years ago

Hi! I am training a random forest, but I am unable to convert the classified calls into the CSV file. Can you provide a detailed example of how to do the conversion?

dancooke commented 5 years ago

Have you tried using the bundled Python script? If you provide details of your training data then I can advise how to use the script.

24natasya commented 5 years ago

Hi, yes, I would appreciate it if you could advise me on that. I am currently training on sample NA12878 for germline analysis.

My reference FASTA is hs37d5. I'm using the truth set from GIAB (NA12878), with these files:

HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_nosomaticdel.bed
HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_PGandRTGphasetransfer.vcf.gz
GRCh37_nexterarapidcapture_expandedexome_targetedregions.bed

On Thu, Sep 5, 2019 at 6:38 PM Daniel Cooke notifications@github.com wrote:

Have you tried using the bundled Python script https://github.com/luntergroup/octopus/blob/develop/scripts/train_random_forest.py? If you provide details of your training data then I can advise how to use the script.


dancooke commented 5 years ago

Ok, and what is your input read data, just a single BAM file?

24natasya commented 4 years ago

Yes just a single BAM file

On Mon, Sep 9, 2019 at 7:30 PM Daniel Cooke notifications@github.com wrote:

Ok, and what is your input read data, just a single BAM file?


dancooke commented 4 years ago

In that case you can just use the bundled Python script to train the forest. See here for an example.
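For readers unsure what the conversion step involves, here is a minimal Python sketch of the general idea. It is not the bundled script and may differ from what that script does. It assumes the calls have already been classified into a true-positive VCF and a false-positive VCF (for example by a comparison tool such as RTG Tools vcfeval), and that the annotations of interest sit in the INFO column. The annotation names `DP` and `MQ`, the `TP` label column, and all function names are hypothetical placeholders for illustration.

```python
import csv
import io


def info_to_dict(info_field):
    """Parse a VCF INFO column such as 'A=1;B=2;FLAG' into a dict."""
    result = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        # Flags have no '=value' part; record them as "1".
        result[key] = value if sep else "1"
    return result


def labelled_rows(vcf_lines, label, annotations):
    """Yield one row of annotation values plus a class label per VCF record."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        info = info_to_dict(fields[7])  # INFO is the 8th VCF column
        yield [info.get(name, "NA") for name in annotations] + [label]


def write_training_csv(tp_lines, fp_lines, annotations, out):
    """Write a CSV with one column per annotation and a final 'TP' label column."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(annotations) + ["TP"])
    writer.writerows(labelled_rows(tp_lines, 1, annotations))  # true positives
    writer.writerows(labelled_rows(fp_lines, 0, annotations))  # false positives


# Usage with tiny in-memory examples (hypothetical records):
tp = ["##fileformat=VCFv4.3", "1\t100\t.\tA\tG\t50\tPASS\tDP=30;MQ=60"]
fp = ["1\t200\t.\tC\tT\t10\tPASS\tDP=5"]
buffer = io.StringIO()
write_training_csv(tp, fp, ["DP", "MQ"], buffer)
print(buffer.getvalue())
```

If the annotations are stored in per-sample FORMAT fields rather than INFO, the parsing step would need to change accordingly; the labelling and CSV-writing steps stay the same.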

dancooke commented 4 years ago

Closing as this is already documented.