cfljam / galaxy-pcr-markers

Scripts for design of PCR-based Marker Assays from DNA sequence variant data, plus Galaxy xml files
7 stars 10 forks source link

Issue with design_primers.py #29

Closed rtraborn closed 9 years ago

rtraborn commented 9 years ago

Hi: I'm having trouble executing the program using your test data from your iPython notebook.

After installing primer3 and cloning your repo, I ran the command exactly as indicated: python design_primers.py -i test-data/targets.fasta -g test-data/targets.gff -T Taq1CAPS.out -p 80 -P 150 -n 1

However, I get the following error: SNP_Target_ID Position Ref_base Variant_base Amplicon_bp PRIMER_LEFT_SEQUENCE PRIMER_RIGHT_SEQUENCE ref_melt_Tm var_melt_Tm Tm_difference Traceback (most recent call last): File "design_primers.py", line 154, in <module> result=P3.run_P3(target_dict=my_target_dict) File "/nfs4shares/jumen_share/lbui/galaxy-pcr-markers/run_p3.py", line 39, in run_P3 output = sp.check_output(input_str,shell=True) File "/nfs4shares/jumen_share/lbui/galaxy-pcr-markers/run_p3.py", line 21, in f raise sp.CalledProcessError(retcode,cmd) subprocess.CalledProcessError: Command 'echo "SEQUENCE_TARGET=150,1 PRIMER_MAX_POLY_X=3 PRIMER_PAIR_MAX_DIFF_TM=1 PRIMER_MAX_GC=80.0 PRIMER_MIN_GC=20.0 PRIMER_OPT_SIZE=20 SEQUENCE_ID=k69_93535 SEQUENCE_EXCLUDED_REGION=144,2 PRIMER_MAX_SIZE=25 PRIMER_OPT_GC_PERCENT=50 PRIMER_OPT_TM=59 PRIMER_MAX_NS_ACCEPTED=1 PRIMER_GC_CLAMP=1 PRIMER_MIN_SIZE=18 PRIMER_MAX_SELF_ANY=8 PRIMER_NUM_RETURN=2 SEQUENCE_TEMPLATE=TGCTAATGACAACGAGGAGGTAGAATCTGGAGATGAATCAGACTCTTCAGTTGCTTCCTGCCCTCCTACACTTAATGAAGGAAAGAAAAAAAGGACAGGGAAGCTTCATAGGCCTTTGAGTCTGAACGCATTTGACATAATTTCCTTTTCCAGAGGATTTGATCTTTCAGGTTTGTTTGAAGAAACGGGAGATGAAACAAGATTTGTGTCGGGTGAAACGATACCAAACATCATATCGAAATTGGAGGAGATTGCAAAAGTGGGTAGTTTCACGTTTAGGAAGAAGGATTGTAGGGTTAG PRIMER_PRODUCT_SIZE_RANGE=80-150 PRIMER_MAX_SELF_END=3 =" | primer3_core ' returned non-zero exit status 255

I reproduced this error using other input data and on another machine. Could you look into this when you have the chance? Thanks!

cfljam commented 9 years ago

sorry I missed this!. will check it out ASAP

cfljam commented 9 years ago

Could you please check the version of primer 3 you are running?

ie.

primer3_core -v
cfljam commented 9 years ago

Hi there The other thing you could try would be running the above in my docker image https://github.com/cfljam/socker

Cheers J

lo2811 commented 9 years ago

Hi, It's Linh from Danforth center. I have Primer3 2.3.6 on my machine.

cfljam commented 9 years ago

I checked this as documented at http://nbviewer.ipython.org/gist/cfljam/5999f7c2975faf8a497b

This worked in the https://github.com/cfljam/socker image with both Python 2(.7) and Python 3 kernels.

Could you please check the format of your target file? It needs to be just the feature tag alone i.e.

k69_93535:SAMTOOLS:SNP:1147
cfljam commented 9 years ago

Was this run on a unix or osx or windows machine? Python version? All my dev is on Python 2.7. I suspect this is in the subprocess management..

lo2811 commented 9 years ago

Hi John,

I ran it on unix windows machine, and my fiancé tried it on a mac. I reran the script as in your suggestion, but still got the same error as before. It could be that I have Python 2.6 too, I reran with Python 2.7, no luck at all.

I guess now I'll have to use your docker image. I’ll keep you updated.

Thank you!

Linh

cfljam commented 9 years ago

Likely that the issue is in the check_output block added as a 2.6/2.7 workaround

http://stackoverflow.com/questions/4814970/subprocess-check-output-doesnt-seem-to-exist-python-2-6-5

if "check_output" not in dir(sp): # duck punch it in!
    def f(*popenargs, **kwargs):
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')
        process = sp.Popen(stdout=sp.PIPE, *popenargs, **kwargs)
        output, unused_err = process.communicate()
        retcode = process.poll()
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
            raise sp.CalledProcessError(retcode,cmd)
        return output
    sp.check_output = f
rtraborn commented 9 years ago

Hi John: thank you for your reply and careful documentation. I reproduced Linh's error using the example you provide, so I suspect the check_output workaround you suggested will work. Could you specify precisely where the workaround code should be inserted? I use Python 3 on my VMs but can set up a sandbox elsewhere.

cfljam commented 9 years ago

Many thanks for checking what I hadnt! This is failing because all my dev was on Python 2.7. Please try it with Python2 specified (a virtualenv?) ..will look at Python 3 sub process handling approaches ASAP

rtraborn commented 9 years ago

No worries, but I should clarify- I performed the aforementioned test using python 2.6.6 on another VM that I don't use regularly. (There's a separate, more conventional error relating to spacing when I run it using python 3).

cfljam commented 9 years ago

The handling of sub processes changed between 2.6 and 2.7, hence the hack from where the error was raised. Since encountering this and other dependency headaches when trying to improve the Galaxy toolset, I have tended to work in Vagrant VMs or Docker containers to improve control over the whole!

cfljam commented 9 years ago

Im struggling to reproduce this but would love to! It works for me on OSX in a Python 2.6.9 virtualenv (conda) and with python3 . Im assuming you checked that primer3_core is in your path and you can run the example like https://umbc.rnet.missouri.edu/resources/primer3_manual.htm#example ?

cfljam commented 9 years ago

We are looking at replacing this function with a call to the much much tidier designPrimers in https://github.com/benpruitt/primer3-py by @benpruitt

http://nbviewer.ipython.org/gist/cfljam/a23baebc796d98363c69

In the meantime I would suggest pulling the repo into a Docker container from https://github.com/cfljam/socker. The Jupy notebooks are fantastic for documenting design

cfljam commented 9 years ago

Could you please try out the dev branch https://github.com/cfljam/galaxy-pcr-markers/tree/dev which uses primer3-py?

rtraborn commented 9 years ago

We resolved the issue yesterday- it was indeed due to a problem with primer3_core . Here's why it happened: primer3 looks for the thermodynamic parameters in primer3_config/. By default it looks in two locations for this: /opt/ and ./primer3_config. In our hands, compiling primer3_config didn't automatically place primer3_config/ in /opt/. As a workaround, we just copied the folder and all of its contents to the galaxy-pcr-markers/test/ directory, as follows:

cd primer3-2.3.6/
cd src/
cp -r primer3_config/ ~/galaxy-pcr-markers/test-data
cd ~/galaxy-pcr-markers/test-data

We then completed the test example, which then worked like a charm.

python ../design_primers.py -i targets.fasta -g targets.gff -T Taq1CAPS.out -p 80 -P 150 -n 1

SNP_Target_ID Position Ref_base Variant_base Amplicon_bp PRIMER_LEFT_SEQUENCE PRIMER_RIGHT_SEQUENCE ref_melt_Tm var_melt_Tm Tm_difference
k69_93535:SAMTOOLS:SNP:1147 1147 C G 132 GGAAGCTTCATAGGCCTTTGAG ATGTTTGGTATCGTTTCACCCG 0 0 0

I suspect this issue arose because of the changes to the default parameters of primer3 in recent years (see: http://emboss.open-bio.org/pipermail/emboss/2012-September/004304.html), although I haven't evaluated this. There's a separate discussion of this issue on a BioPerl forum here: http://www.perlmonks.org/?node_id=952518.

Thanks for all of your help and careful, clear documentation so that we could get to the bottom of the issue.

cfljam commented 9 years ago

Many thanks for this. I had already encountered this as noted in https://github.com/cfljam/galaxy-pcr-markers/issues/5

The changes in Primer3 parameters have also caused some headaches. I think that replacing my subprocess hack with the better crafted primer3-py will help catch these errors better.

cfljam commented 9 years ago

Replacement with primer3-py means that a standalone primer3 installation is no longer required!