rdpstaff / Framebot

Dynamic programming based frame shift detection and correction tool with nearest neighbor classification.
GNU General Public License v3.0

Question about FrameBot index #1

Closed: fescudie closed this issue 10 years ago

fescudie commented 10 years ago

Hello,

I am trying to use FrameBot 1.0 and I have a problem with the index. When I run FrameBot with an index on the example data, all sequences fail. When I do not use an index, many sequences are corrected.

Test with index

Commands:
java -jar /usr/local/bioinfo/src/FrameBot/current/dist/FrameBot.jar index /usr/local/bioinfo/src/FrameBot/current/example/nifH_test_refseq_prot.fa nifH_test.index
java -jar /usr/local/bioinfo/src/FrameBot/current/dist/FrameBot.jar framebot -o nifH_test nifH_test.index /usr/local/bioinfo/src/FrameBot/current/example/nifH_test_query.fa

Results:
-rw-r--r-- 1 owner GROUP     0 25 janv. 09:48 nifH_test_corr_nucl.fasta
-rw-r--r-- 1 owner GROUP     0 25 janv. 09:48 nifH_test_corr_prot.fasta
-rw-r--r-- 1 owner GROUP 56912 25 janv. 09:48 nifH_test_failed_framebot.txt
-rw-r--r-- 1 owner GROUP 17460 25 janv. 09:48 nifH_test_failed_nucl.fasta
-rw-r--r-- 1 owner GROUP     0 25 janv. 09:48 nifH_test_framebot.txt
-rw-r--r-- 1 owner GROUP  4149 25 janv. 09:46 nifH_test.index

Test without index

Command:
java -jar /usr/local/bioinfo/src/FrameBot/current/dist/FrameBot.jar framebot -N -o nifH_test_withoutIndex /usr/local/bioinfo/src/FrameBot/current/example/nifH_test_refseq_prot.fa /usr/local/bioinfo/src/FrameBot/current/example/nifH_test_query.fa

Results:
-rw-r--r-- 1 owner GROUP 16549 25 janv. 09:50 nifH_test_withoutIndex_corr_nucl.fasta
-rw-r--r-- 1 owner GROUP  6137 25 janv. 09:50 nifH_test_withoutIndex_corr_prot.fasta
-rw-r--r-- 1 owner GROUP  1156 25 janv. 09:50 nifH_test_withoutIndex_failed_framebot.txt
-rw-r--r-- 1 owner GROUP   359 25 janv. 09:50 nifH_test_withoutIndex_failed_nucl.fasta
-rw-r--r-- 1 owner GROUP 56090 25 janv. 09:50 nifH_test_withoutIndex_framebot.txt

Can you explain this difference?

wangqion commented 10 years ago

The README on the FrameBot GitHub repository has a more detailed explanation. Basically, FrameBot builds the index file from the input reference DNA sequences, and those reference DNA sequences should cover the exact same protein-coding region. In your test the index was built from the protein file nifH_test_refseq_prot.fa; to test index building, use the example nifH_test_refseq_nucl.fa instead. We also provide a pre-built nifH index file in refset/nifh_nucl_ref_polyprimers_g13f10e4.index.
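As a rough sketch of that fix (not part of the original reply), the two commands from the question can be rerun with the nucleotide reference in place of the protein one. The command syntax is copied from the question. The location of nifH_test_refseq_nucl.fa in the example directory is an assumption, and `run`, `build_index`, `correct_reads`, and the `DRY_RUN` variable are hypothetical helpers added only for illustration.

```shell
#!/bin/sh
# Sketch only: paths mirror those in the question; adjust for your install.
FRAMEBOT_DIR="${FRAMEBOT_DIR:-/usr/local/bioinfo/src/FrameBot/current}"
JAR="$FRAMEBOT_DIR/dist/FrameBot.jar"
# Assumption: the nucleotide reference sits beside the other example files.
REF_NUCL="$FRAMEBOT_DIR/example/nifH_test_refseq_nucl.fa"
QUERY="$FRAMEBOT_DIR/example/nifH_test_query.fa"

# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: build the index from the nucleotide reference, not the protein file.
build_index() {
  run java -jar "$JAR" index "$REF_NUCL" nifH_test.index
}

# Step 2: run FrameBot against that index (same call as in the question).
correct_reads() {
  run java -jar "$JAR" framebot -o nifH_test nifH_test.index "$QUERY"
}
```

Calling `build_index && correct_reads` runs both steps. Setting `DRY_RUN=1` first only prints the commands, which is handy for checking the paths before anything is written.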