Nextomics / NextPolish

Fast and accurately polish the genome generated by long reads.
GNU General Public License v3.0
205 stars, 28 forks

Need information related to example given in NextPolish Software under the folder test_data #106

Closed DrNavi closed 1 year ago

DrNavi commented 1 year ago

Hi all,

I am trying to use the NextPolish tool to polish my draft assembly. First, I wanted to understand the example data provided with the tool in the test_data folder. The folder contains these files:

~/NextPolish/test_data/hifi.fasta.gz
~/NextPolish/test_data/hifi.fofn
~/NextPolish/NextPolish/test_data/hifi.run.cfg
~/NextPolish/NextPolish/test_data/lgs.fofn
~/NextPolish/NextPolish/test_data/lreads.fasta.gz
~/NextPolish/NextPolish/test_data/raw.genome.fasta
~/NextPolish/NextPolish/test_data/run.cfg
~/NextPolish/NextPolish/test_data/sgs.fofn
~/NextPolish/NextPolish/test_data/sreads.R1.fastq.gz
~/NextPolish/NextPolish/test_data/sreads.R2.fastq.gz

In this file list, I am confused about what the raw.genome.fasta file is. Is it the reference genome of the target species from a database such as NCBI or Ensembl? This raw.genome.fasta file is used in run.cfg. Here is the content of the run.cfg file:

[General]
job_type = local
job_prefix = nextPolish
task = default
rewrite = yes
deltmp = yes
rerun = 3
parallel_jobs = 2
multithread_jobs = 3
genome = ./raw.genome.fasta
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 100

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 5k -max_depth 100
lgs_minimap2_options = -x map-ont

Please help me understand this.

My other question: if I have only long reads, do I simply need to remove the sgs_option settings?

Looking forward to hearing from you

Regards, Dr. Naveed

moold commented 1 year ago

raw.genome.fasta is the assembly file to be polished. As for your other question ("if I have only long read then do we only need to avoid/eliminate the sgs_option commands???"): yes.
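For readers who want to script this change, here is a minimal sketch that takes the run.cfg content quoted in the question and drops the [sgs_option] section, leaving a long-read-only config. The function name long_read_only_cfg is hypothetical and not part of NextPolish, and the sketch only performs the section removal confirmed in the reply above. Whether other settings (for example task) should also change is not covered in this thread, so check the options documentation.

```python
import configparser
import io

# run.cfg content as quoted in the question above.
RUN_CFG = """\
[General]
job_type = local
job_prefix = nextPolish
task = default
rewrite = yes
deltmp = yes
rerun = 3
parallel_jobs = 2
multithread_jobs = 3
genome = ./raw.genome.fasta
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 100

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 5k -max_depth 100
lgs_minimap2_options = -x map-ont
"""


def long_read_only_cfg(cfg_text: str) -> str:
    """Return the config text with the [sgs_option] section removed.

    Hypothetical helper for illustration; not part of NextPolish.
    """
    # interpolation=None keeps values such as "{multithread_jobs}" untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser.remove_section("sgs_option")  # no short reads available
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(long_read_only_cfg(RUN_CFG))
```

The printed text can be saved as a new config file, with genome pointing at your own draft assembly instead of ./raw.genome.fasta.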

moold commented 1 year ago

see https://nextpolish.readthedocs.io/en/latest/OPTION.html

DrNavi commented 1 year ago

Dear Hu Jiang,

Thank you so much for clearing up my confusion. I just have one more question. The sequencing company provided me with only one nanopore FASTA file, named "KI.fasta.tar.gz". I did not get two long-read files as mentioned in the example data, so I can list only one read file in the .fofn file. Is that OK? Please clear up my confusion.

Thanks.

Regards, Dr. Naveed


On Wednesday, February 15, 2023 at 06:09:47 PM GMT+9, Hu Jiang wrote:

raw.genome.fasta is the assembly file to be polished. As for your other question ("if I have only long read then do we only need to avoid/eliminate the sgs_option commands???"): yes.


moold commented 1 year ago

Yes, but you need to decompress KI.fasta.tar.gz first.
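As a rough sketch of that step, the following unpacks a .tar.gz archive and writes a .fofn file listing the extracted read files, one path per line. The helper names extract_reads and write_fofn, the output directory name reads, and the assumption that the archive contains only the read file(s) are all hypothetical. Inspect the extracted files before using them.

```python
import shutil
import tarfile
from pathlib import Path


def extract_reads(archive, out_dir):
    """Extract every regular file in a .tar.gz archive into out_dir.

    Hypothetical helper. Files are written flat into out_dir (directory
    components inside the archive are dropped). Returns absolute paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and so on
            target = out_dir / Path(member.name).name
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target.resolve())
    return extracted


def write_fofn(read_files, fofn_path):
    """Write one read-file path per line to fofn_path."""
    Path(fofn_path).write_text("".join(f"{p}\n" for p in read_files))


if __name__ == "__main__":
    archive = Path("KI.fasta.tar.gz")  # file name taken from the question
    if archive.exists():
        reads = extract_reads(archive, "reads")  # "reads" is an assumed directory
        write_fofn(reads, "lgs.fofn")
        print(reads)
```

A .fofn with a single line is enough when there is only one long-read file, as confirmed in the reply above.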