Issue with sequence_based_ir_id.pl result

mrmckain / Fast-Plast

Automated de novo assembly of whole chloroplast genomes.

MIT License

43 stars, 14 forks

Issue with sequence_based_ir_id.pl result #43

Open nsmt89 opened 4 years ago

nsmt89 commented 4 years ago

Hi, I tried using sequence_based_ir_id.pl on an assembly that I produced with different software, but it only gives me 3 regions (ir, ir, sc). The IRs are not even the same length (they differ by only ~1 kb), but the LSC and IR lengths seem plausible for my genome, as they are close to those of related, available chloroplast genomes. Can I still locate the SSC and produce the full plastome in the right orientation? Do you have any suggestions? CN_NOVO_MITO_1_regions_split0.txt

mrmckain commented 4 years ago

The start/stops of the putative regions suggest that the script parameters used were not optimal. Instead of using 0, you can try other values like 1 or 2. You can also use another parameter (entered after the 0, 1, or 2), which defines the minimum size for a region to be considered sc or ir. The default is 10,000. Try 2000 and see what your results are.

mrmckain commented 4 years ago

Syntax: perl sequence_based_ir_id.pl sequence_file_to_split name_base skip_parameter min_region_size
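
To make the suggestion above concrete, here is a minimal sketch that builds one command line per combination of the skip parameter (0, 1, 2) and minimum region size (10000, 2000), following the syntax given in this thread. The file name `my_assembly.fasta`, the base name `my_sample`, and the helper `build_commands` are hypothetical. Suffixing the base name per run is only an assumption to keep outputs apart, not something the script requires.

```python
import shlex
from itertools import product


def build_commands(fasta, name_base, skips=(0, 1, 2), min_sizes=(10000, 2000),
                   script="sequence_based_ir_id.pl"):
    """Return one argv list per (skip_parameter, min_region_size) pair.

    Argument order follows the syntax quoted in the thread:
    perl sequence_based_ir_id.pl sequence_file_to_split name_base
         skip_parameter min_region_size
    """
    commands = []
    for skip, min_size in product(skips, min_sizes):
        # Assumption: a distinct name_base per run keeps the outputs separate.
        run_name = f"{name_base}_skip{skip}_min{min_size}"
        commands.append(
            ["perl", script, fasta, run_name, str(skip), str(min_size)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical input names; this only prints the commands for review.
    for cmd in build_commands("my_assembly.fasta", "my_sample"):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argv lists can be passed to `subprocess.run(cmd, check=True)` to run the sweep and then compare the reported regions across runs.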

mrmckain commented 4 years ago

I see you have been struggling with your assembly across GetOrganelle and NOVOPlasty as well. Send me the file you are putting into sequence_based_ir_id.pl and I will troubleshoot. I can get a sense of what is going on more quickly that way than through a back-and-forth.

nsmt89 commented 4 years ago

Thank you very much for noticing. Can I email my fasta file directly?

mrmckain commented 4 years ago

Np. mrmckain at ua dot edu

nsmt89 commented 4 years ago

I have emailed my fasta file. Thank you.