biologger / speciesprimer

The SpeciesPrimer pipeline is intended to help researchers find specific primer pairs for the detection and quantification of bacterial species in complex ecosystems.
GNU General Public License v3.0

Error: "No .gff files found for QualityControl rRNA" #11

amoghpj closed 3 years ago

amoghpj commented 3 years ago

Hi there!

Thanks for this package. I am trying to run the pipeline directly from the command line to understand its components. I have the nt database downloaded and am trying to get primers for three species (Acinetobacter baylyi, Brachybacterium rhamnosum, Corynebacterium amycolatum). SpeciesPrimer downloads the fna files for these species, but the script then errors out at the QC stage with "No .gff files found for QualityControl rRNA" for each of them. (Sure enough, the /gff_files folders are empty.) I looked through the code and found that the FTP link for the fna files is constructed on line 242 of speciesprimer.py, but I haven't found an explicit construction of the gff URLs.

Question: Where in the code are the gff files downloaded, if at all?

I can share the config files if they are useful for debugging. Let me know if I am missing something obvious.

Thank you for your time!

biologger commented 3 years ago

Hi,

The gff files are not downloaded. The annotation step using prokka generates the gff and ffn files for the subsequent steps, and the pipeline copies these files from the prokka output directories to the /gff_files directory.
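
As a rough illustration of that copy step (this is not the actual SpeciesPrimer code), the sketch below gathers every file with a given suffix from a tree of prokka output directories into one target directory. The function name `collect_annotation_files` and the directory layout are assumptions made for the example only.

```python
import shutil
from pathlib import Path


def collect_annotation_files(prokka_root, target_dir, suffix=".gff"):
    """Copy every file ending in `suffix` found below prokka_root into target_dir.

    Hypothetical helper: the real pipeline may organise its directories differently.
    Returns the names of the copied files.
    """
    prokka_root = Path(prokka_root)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(prokka_root.rglob("*" + suffix)):
        if not src.is_file():
            continue
        # Skip files that already live in the target directory
        # (relevant when target_dir is located inside prokka_root).
        if src.parent.resolve() == target_dir.resolve():
            continue
        shutil.copy2(src, target_dir / src.name)
        copied.append(src.name)
    return copied
```

An empty return value corresponds to the situation in the error message: if the annotation step never ran or produced no output, there is nothing to copy and the gff_files directory stays empty.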

Does "directly from the command line" mean that you are not using the Docker container? This is currently not supported, and you would need to install all dependencies manually. I would recommend working on the command line inside the Docker container. For development, there is also a way to link the local scripts into the container, so there is no need to rebuild the container each time.

amoghpj commented 3 years ago

Thank you for your response! I was wondering where the prokka output was being used, so this makes sense. I am currently unable to use Docker due to the setup of our compute cluster. I tried using Singularity to build the image, but that hasn't worked so far.

Thanks again, I'll see if there is any other way that I can run the docker image.

biologger commented 3 years ago

For usage with Singularity, you may have to additionally mount/bind the speciesprimer directory (cloned from GitHub), as the containers are read-only and the pipeline stores its temporary config in the /pipeline directory.

Download of the genome assemblies and annotation worked for me using the following command:

    $ singularity shell -B /home/user/primerdesign:/primerdesign,/home/user/blastdb:/blastdb,/home/user/speciesprimer/pipeline:/pipeline docker://biologger/speciesprimer
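
If the bind paths differ between machines, it can help to assemble the command from a list of (host path, container path) pairs instead of editing the long string by hand. The following is a minimal sketch under that assumption. The helper name `singularity_shell_command` is made up for this example, and the paths are the ones from the command above.

```python
import shlex


def singularity_shell_command(binds, image):
    """Build the argument list for `singularity shell` with comma-separated bind mounts.

    `binds` is a list of (host_path, container_path) pairs.
    Hypothetical helper for illustration; it only assembles the command.
    """
    bind_arg = ",".join(f"{host}:{container}" for host, container in binds)
    return ["singularity", "shell", "-B", bind_arg, image]


binds = [
    ("/home/user/primerdesign", "/primerdesign"),
    ("/home/user/blastdb", "/blastdb"),
    # Writable copy of the pipeline directory from the cloned repository.
    ("/home/user/speciesprimer/pipeline", "/pipeline"),
]
cmd = singularity_shell_command(binds, "docker://biologger/speciesprimer")

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed to `subprocess.run(cmd)` directly, which avoids shell quoting problems if a path contains spaces.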