joybio / multiPrime

multiPrime is a mismatch-tolerant minimal primer set design tool for large and diverse sequences (e.g. viruses). A web-based version is available (test: http://multiPrime.cn).
MIT License

multiPrime result #17

Open tobbyxy opened 3 months ago

tobbyxy commented 3 months ago

Hi @joybio thank you for your package.

In the multiPrime results folder for the CDS test data, there are two final max forward and reverse primers, all from just one cluster. Is it expected that the final results will contain only one cluster? Should I be concerned, given that the input sequences are considerably different?

tobbyxy commented 3 months ago

I saw this in the paper: "The primary output will encompasses two ultimate primer sets and the cluster specific primers". Does it mean only one cluster is considered for the ultimate primer set? What about the primers from the other clusters? I'm asking because I'm getting only one cluster. Thanks.

joybio commented 3 months ago

@tobbyxy For large input sequence sets, it's common to encounter sequences that cannot be clustered, which may result in a very large final primer set. To manage this, we filter out clusters that contain few sequences; the primers for the remaining clusters form what we call the core primer set. You get only one primer set when the final primer set is generated but no core primer set is produced. By default, a cluster must contain at least 10 sequences to qualify for the core primer set, so if no cluster meets this threshold, only the final primer set is available.
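The size filter described above can be sketched as follows. This is only an illustration of the idea, not multiPrime's actual code: the function name `split_core_clusters`, the `cluster_of` mapping, and the `min_size` argument are all hypothetical, and only the default of 10 sequences comes from the reply.

```python
from collections import defaultdict


def split_core_clusters(cluster_of, min_size=10):
    """Split clusters into 'core' (>= min_size sequences) and 'small'.

    cluster_of: dict mapping sequence ID -> cluster ID (hypothetical input).
    min_size:   minimum number of sequences for a core cluster
                (10 is the default mentioned in the thread).
    """
    members = defaultdict(list)
    for seq_id, cluster_id in cluster_of.items():
        members[cluster_id].append(seq_id)

    core = {c: ids for c, ids in members.items() if len(ids) >= min_size}
    small = {c: ids for c, ids in members.items() if len(ids) < min_size}
    return core, small


# Toy data: cluster "A" has 12 sequences, cluster "B" has 3.
toy = {f"seqA{i}": "A" for i in range(12)}
toy.update({f"seqB{i}": "B" for i in range(3)})
core, small = split_core_clusters(toy)
print(sorted(core), sorted(small))
```

With a small test data set, every cluster may land in `small`, which matches the situation where only the final primer set is reported.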

tobbyxy commented 3 months ago

Thank you for your response @joybio. I have some follow-up questions. Hypothetically, if I have a 3000bp sequence as input, with the amplicon size (product length) set to 1000bp, will the designed primer set amplify the full 3000bp sequence or only the 1000bp amplicon? We are interested in using this tool to design primers for multiple regions within the same gene, so as to cover the entire length of the gene. So we want to know: 1) Can we have multiple amplicons (targeting different regions) for individual clusters? 2) MultiPrime currently reports the number of sequences covered by each primer; can we access the coverage breadth?

I appreciate your response.

joybio commented 3 months ago

@tobbyxy

  1. Regarding your hypothetical scenario of using a 3000bp sequence as input with an amplicon size (product length) set to 1000bp: The primer set will amplify a product of 1000bp, which is the specified amplicon size. It will not cover the entire 3000bp sequence in one amplicon. However, in MultiPrime, the PCR product size can be set to a range, with the default range being 150-1200bp.
  2. About targeting multiple regions for individual clusters: Yes, you can generate multiple amplicons targeting different regions within each cluster. MultiPrime outputs a candidate primer list for each cluster, which you can find in the output directories listed below. Here is a breakdown of the relevant directories and files:

    Clusters_cprimer:

    Contains candidate primers (spanning from the start to stop positions) for each cluster, available in both .fa and .txt formats.

    Primers_set:

    Contains all candidate primers for each cluster, along with sorted primers and coverage statistics. Additionally, directories such as Clusters_target provide detailed information about each cluster, and the PCR_product folder includes the perfect PCR products for each primer. Here is an example: [image: 1724373654320]

All the candidate primers are listed like this.

  3. Regarding access to coverage breadth: Yes, the coverage breadth can be set by the user in the YAML configuration file (e.g., in the PRODUCT_size parameter). This allows you to define the range of PCR product sizes, ensuring your primers meet the desired coverage across the targeted regions.
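If "coverage breadth" is taken to mean the fraction of the gene spanned by the chosen amplicons, one way to estimate it yourself is sketched below. This is an assumption-laden illustration, not a multiPrime feature: the function names are hypothetical, and the amplicon coordinates are assumed to be 0-based, half-open `(start, end)` pairs that you have read from the candidate primer lists.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def coverage_breadth(amplicons, gene_length):
    """Fraction of gene positions covered by at least one amplicon."""
    covered = 0
    for start, end in merge_intervals(amplicons):
        # Clip to the gene boundaries before counting.
        start, end = max(start, 0), min(end, gene_length)
        if end > start:
            covered += end - start
    return covered / gene_length


# Hypothetical example: three ~1000bp amplicons tiled across a 3000bp gene.
amplicons = [(0, 1000), (900, 1900), (1800, 2800)]
print(f"{coverage_breadth(amplicons, 3000):.3f}")
```

A single 1000bp amplicon on the 3000bp input from the question above would give a breadth of one third, which is why several amplicons from the candidate list are needed to span the whole gene.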