jiqingxiaoxi / GLAPD

design common and specific primer for LAMP using the whole genome
GNU General Public License v2.0
11 stars 8 forks source link

This is a customizable LAMP primer sets designing system: GLAPD (whole genome based LAMP primer design for a set of target genomes).

Introduction: LAMP(Loop-mediated isothermal amplification) is a simple and effective new method to amplify DNA sequence. One LAMP primer set contains four single LAMP primers from six primer regions in genome. The six regions are F3, F2, F1c, B1c, B2 and B3. Sequences from F1c and F2 are synthesized into primer FIP and sequences from B1c and B2 are synthesized into BIP. In order to accelerate the amplification, two additional loop primers(LF, LB) can be added.

GLAPD can design LAMP primer sets based on whole genome. It can design common primers for a set of target genomes and the primers are specific for background genomes. Firstly, GLAPD identifies all candiate single primer regions. Then those single primers are aligned to target genomes and background genomes. Thirdly, GLAPD combines candidate single primers into LAMP primer sets. At last the commonality and specificity check are calculated for LAMP primer sets with the information from alignment.

SYSTEM REQUIREMENTS: GLAPD now runs under Linux operation system, it needs perl and gcc. If run GPU version, the computer capability of GPU >=2.0.

OTHER SOFTWARES MAY BE NEEDED Bowtie, which could be downloaded from http://bowtie-bio.sourceforge.net/index.shtml CUDA driver, which could be downloaded from http://www.nvidia.com

Files and Directories: This system is packaged in one file, "GLAPD.tar.gz", which can be downloaded from http://cgm.sjtu.edu.cn/GLAPD/ and https://github.com/jiqingxiaoxi/GLAPD2.git. Important files inside this tar.gz file are listed here.

|-Par/ tm_nn_parameter.txt Parameters for calculating Tm. stab_parameter.txt Parameters for calculating the stability of single primers. .db, .ds Sixteen files for calculating the secondary structure of primers. |-par.pl Get the information of positions and mismatches for each single primer. |-single.c The CPU version of GLAPD for identify single primer regions. |-LAMP.c The CPU version of GLAPD for design LAMP primer sets. |-Makefile Makefile for CPU version. |-GPU/ single.cu The GPU version of GLAPD for identify single primer regions. LAMP.cu The GPU version of GLAPD for design LAMP primer sets. Makefile Makefile for GPU version. |-bowtie/ bowtie The bowtie program bowtie-build The bowtie-build program |-example/ example.fa Sequences as example, from "NC_002951.2 Staphylococcus aureus subsp. aureus COL chromosome". target-list.txt The list of names from target genomes. It contains 3 strains of S. aureus. background-list.txt The list of names from background genomes. It contains 3 strains of other bacteria. index.* The index files for Bowtie software.

INSTALLATION: tar -zxvf GLAPD.tar.gz If you want to use CPU version: cd GLAPD/ make If you want to use GPU version: cd GLAPD/GPU/ make

QUICK START:

  1. If you want design LAMP primers for a sequence without taking care of commonality and specificity: cd GLAPD/ ./Single -in example/example.fa -out Test (Two files "Inner/Test" and "Outer/Test" are created.) ./LAMP -in Test -ref example/example.fa -out success.txt (Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

  2. If you want design common LAMP primers without taking care of specificity: cd GLAPD/ ./Single -in example/example.fa -out Test (Two files "Inner/Test" and "Outer/Test" are created.) perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt (Three files "Inner/Test-common_list.txt", "Inner/Test-common.txt" and "Outer/Test-common.txt" are created.) ./LAMP -in Test -ref example/example.fa -out success.txt -common (Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

  3. If you want design specific LAMP primers without taking care of commonality: cd GLAPD/ ./Single -in example/example.fa -out Test (Two files "Inner/Test" and "Outer/Test" are created.) perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --specific example/background-list.txt
    (Two files "Inner/Test-specific.txt" and "Outer/Test-specific.txt" are created.)
    ./LAMP -in Test -ref example/example.fa -out success.txt -specific
    (Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

  4. If you want design common and specific LAMP primers: cd GLAPD/ ./Single -in example/example.fa -out Test (Two files "Inner/Test" and "Outer/Test" are created.) perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt --left (Five files "Inner/Test-common_list.txt", "Inner/Test-common.txt", "Inner/Test-specific.txt", "Outer/Test-common.txt" and "Outer/Test-specific.txt" are created.) ./LAMP -in Test -ref example/example.fa -out success.txt -common -specific (Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

    RUN THE SYSTEM: 1.Identify candidate single primer regions: Command: Single -in -out [options]*

Arguments: -in reference genome, fasta formate -out output the candidate single primers -dir the directory for output file default: current directory -loop identifiy candidate single primer regions for loop primers -check check single primers' secondary structure or not 0: don't check secondary structure; other values: check default: 1 -par parameter files under the directory are used to check primers' secondary structure default: GLAPD/Par/ -h[-help] print usage

2.Align sequences from single primer regions(optional): Command: perl par.pl --in --ref --common[--specific] --bowtie --index [options]*

Arguments: --in the file name of candidate single primer regions, files are generated from Single program --ref reference genome, fasta formate --dir dirctory for files of candidate single primer regions default: current directory --loop include loop primers --common the genomes in the file(target genomes) are expected to be amplified by LAMP primer sets --specific the genomes in the file(background genomes) are not expected to be amplified by LAMP primer sets --left background_group = all_genome_in_database - target_group used with --common invalid if exist --specific --bowtie the bowtie program --index bowtie index file name, comma-separated --mis_s the max number of mismatches allowed when align single primers to background genomes this value between 0 and 3. the bigger of the value, the more specific default: 2 --mis_c the max number of mismatches allowed when align single primers to target genomes this value between 0 and 3. the smaller of the value, the more common default: 0 --threads number of threads to launch when align default: 1 --help|--h print help information

3.Design LAMP primer sets: Command: LAMP -in -ref -out [options]*

Arguments: -in the file name of candidate single primer regions, files are generated from Single program -ref reference genome, fasta formate -dir the directory for output file default: current directory -out output successfully designed LAMP primer sets -num the expected output number of LAMP primer sets default: 10 -loop design LAMP primer sets with loop primers -common design common LAMP primer sets those can amplify more than one target genomes -specific design specific LAMP primer sets those can't amplify any background genomes -check check primers' tendency of binding to another in one LAMP primer set or not 0: don't check; other values: check default: 1 -par parameter files under the directory are used to check primers' binding tendency default: GLAPD/Par/ -fast fast mode to design LAMP primer sets, in this mode GLAPD may lost some right results -h/-help print usage

INPUT FILE FORMAT: 1) The reference genome must be fasta format. 2) The common list file and specific list file used in step 2 must be one genome name per line. They can be generated by taking the sequence names from target or background genomes directly. 3) The bowtie index can be generated by "bowtie-build" command, more details in http://bowtie-bio.sourceforge.net/tutorial.shtml. OUTPUT FILE FORMAT: 1) In the files generated by "Single" program, each line means a candidate single primer. For example: "pos:3 length:25 +:2 -:0 61.89" Each line has five fields seperated by tabs. From left to right, the fields are:

  1. The position of the single primer in reference genome (0-based)
  2. The length of the single primer 3,4. Sum of all applicable flags. "+" means the primer from the plus strand of reference genome and "-" means the primer from minus strand. If the number is "0", the single primer isn't from the plus or minus strand of reference genome. Flags are: 1: this single primer can be used to amplify a target if the GC-content of target region is >=60% 2: this single primer can be used to amplify a target if the GC-content of target region is <=45% 4: this single primer can be used to amplify a target if the GC-content of target region is between 45% and 60%
  3. Tm 2) In the files generated by "par.pl" ("XX-common.txt" and "XX-specific.txt), each line stores an alignment. For example: "3 25 2 501407 1 0" Each line has six fields seperated by tabs. From left to right, the fields are:
  4. The position of the single primer in reference genome (0-based)
  5. The length of the single primer
  6. The genome turn in target group or background group (0-based)
  7. The position of alignment in this genome(field 3)
  8. "1" means this single primer can be used to amplify the genome (field 3) in plus strand. "0" means this single primer can't be used to amplify the genome (field 3) in plus strand.
  9. "1" means this single primer can be used to amplify the genome (field 3) in minus strand. "0" means this single primer can't be used to amplify the genome (field 3) in minus strand. 3) In the file generated by "par.pl" ("XX-common_list.txt"), each line contains one target genome. For example: "NC_002951.2 0" Each line has two fields seperated by tab. From left to right, the fields are:
  10. The name of target genome
  11. The turn of target genome in target group (0-based) 4) The LAMP primer set is stored in the file generated by "LAMP" program, for example: "The 1 LAMP primers: F3: pos:36,length:18 bp, primer(5'-3'):CGGTTCCCTGTACTCGAA F2: pos:74,length:19 bp, primer(5'-3'):AATTCCTTTGTTGAGGCCG F1c: pos:115,length:24 bp, primer(5'-3'):CGAAATCTTCAAACACTACGTGCT B1c: pos:175,length:22 bp, primer(5'-3'):CCTGACGGAAGCAGCATTAAGT B2: pos:237,length:19 bp, primer(5'-3'):CGAACGTAACCAAAGTCGT B3: pos:276,length:23 bp, primer(5'-3'):TAAAAAATAAAAAACCGTGCACC This set of LAMP primers could be used in 3 genomes, there are: NC_002951.2, NC_017340.1, NC_002745.2" One LAMP primer set contains at least six single primers. The positions of single primers in reference genome and their length, sequence are listed in this file. When user designs common primers, which target genomes can be amplified by this primer set are also listed in this file.

    TIPS: 1) Select reference genome: The reference genome can be select one randomly from the group of target genomes, or the most expected genome amplified by the LAMP primer set. 2) Specific file: When run the step 2, if you have the "common file", you can use "--left" option to replace the "--specific". In this way, all genomes in database expect for those in "common file" are defined as the background genomes. 3) Fast mode: When run the step 3, use the "-fast" option can accelerate the designing. But this mode may lost some right LAMP primer sets.

    If you have any questions, please contact with us: Ben Jia: chenmodexiaoxi@126.com Chaochun Wei: ccwei@sjtu.edu.cn