Closed. rfcohen closed this 3 years ago.
Name your custom gene-call file myGenome.custom.gff, where myGenome matches the output subdirectory name (which is best given the same name as the genome in your genome's fasta file, myGenome.fasta). Place the custom gene-call file in the PipelineInput/ directory. The code recognizes the file by its name and moves it to the corresponding output subdirectory. If you are running more than one genome, you need a custom gene-call file for each genome. The custom gene-call file should be in GFF format, like the GFF output that Prodigal produces. Ignore the custom_gene_caller_name configuration parameter; it was not useful, and I removed it from the sample configuration file.
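The naming rule above can be checked before launching a run. This is only a minimal sketch, not part of multiPhATE2: the helper names custom_gff_path and missing_custom_gffs are hypothetical, and it assumes you pass in the same output subdirectory names that you set in your configuration file.

```python
from pathlib import Path


def custom_gff_path(input_dir, output_subdir_name):
    """Build the expected file path: <input_dir>/<output_subdir_name>.custom.gff.

    Hypothetical helper; it only encodes the naming rule described in this thread.
    """
    return Path(input_dir) / f"{output_subdir_name}.custom.gff"


def missing_custom_gffs(input_dir, output_subdir_names):
    """Return the output subdirectory names that have no matching custom gene-call file."""
    return [
        name
        for name in output_subdir_names
        if not custom_gff_path(input_dir, name).is_file()
    ]


if __name__ == "__main__":
    # "myGenome" is the placeholder name from the comment above, not a real genome.
    for name in missing_custom_gffs("PipelineInput", ["myGenome"]):
        print(f"No custom gene-call file found: {custom_gff_path('PipelineInput', name)}")
```

An empty list from missing_custom_gffs means every genome in the run has a file with the expected name in the input directory.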
Thanks. This helps a lot. Still have an error with the .cgc file.
[Errno 2] No such file or directory: '/home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs_test/custom.cgc'
primary calls file, /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs_test/custom.cgc
phate_sequenceAnnotation_main says, ERROR: Check the formats of your input file(s):
genome file is /home/rcohen/Documents/multiPhATE2/PipelineInput/EcCH94Phi94_contigs.fasta
primary gene call file is /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs_test/custom.cgc
There is no custom .cgc file created. Where does it come from?
Any guidance would be greatly appreciated.
Please post your custom gene-call file, or send to me via email, if you prefer.
Thx. Would be happy to email the gene file. What’s the best email address?
multiphate@gmail.com
Thank you. I emailed the files.
-Rob
Hi Rob, Your custom gene-call file has 8 columns--should be 9. It is missing the "score" column, which occurs after the start/stop columns and before the strand (+/-).
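A quick way to catch this kind of problem is to count the columns on every feature line. The following is a minimal sketch, not code from multiPhATE2: the function name find_bad_gff_lines is hypothetical, and it assumes the file is tab-separated with the standard nine GFF columns (seqid, source, type, start, end, score, strand, phase, attributes).

```python
def find_bad_gff_lines(lines, expected_columns=9):
    """Return (line_number, column_count) for each feature line with the wrong column count.

    Blank lines and lines starting with '#' (comments and directives) are skipped.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\n")
        if not stripped.strip() or stripped.startswith("#"):
            continue
        count = len(stripped.split("\t"))
        if count != expected_columns:
            bad.append((number, count))
    return bad


if __name__ == "__main__":
    # The path is a placeholder following the naming rule discussed in this thread.
    with open("PipelineInput/myGenome.custom.gff") as handle:
        for number, count in find_bad_gff_lines(handle):
            print(f"line {number}: {count} columns, expected 9")
```

If a hand-curated call has no score, GFF uses a single "." as the placeholder in that column, so the line still has nine columns.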
Thank you. Will check it out.
But that doesn’t explain why the prodigal file produced the same error. If the input should be modeled like the prodigal file and I used the output of prodigal as the input as a test, it should have worked?
-Rob
The issue was related to the use of checkpoints. The code appears to be functioning. Thank you for using multiPhATE2! :-)
Thanks for all your help. I love this tool.
Following on from issue 16.
I have a custom set of hand-curated gene calls. What should that file be named, and where should it go? And what should the config file settings be to use it as the input to the annotation engine? Here is what I have now. The filename is:
PipelineInput/phate_custom.gff
Config file settings are:
custom_gene_calls='true'
custom_gene_caller_name='custom'
primary_calls='custom'
And the error is this:
[Errno 2] No such file or directory: '/home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs/custom.cgc'
primary calls file, /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs/custom.cgc
phate_sequenceAnnotation_main says, ERROR: Check the formats of your input file(s):
genome file is /home/rcohen/Documents/multiPhATE2/PipelineInput/EcCH94Phi94_contigs.fasta
primary gene call file is /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs/custom.cgc
outfile is /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs/phate_sequenceAnnotation_main.out
gfffile is /home/rcohen/Documents/multiPhATE2/PipelineOutput/EcCH94Phi94_contigs/phate_sequenceAnnotation_main.gff
Thanks.