Closed by Artifice120 5 months ago
Hi, can you please confirm that you are using version v0.1.2-alpha from GitHub?
Yes, I used the git clone command 4 days ago to download egapx.
Just ran the example data and this set ran fine.
I am also using a node with 56 CPUs and 2000 GB of RAM.
Could you try with SRA data (using the reads_ids option)? I see one run, SRR1821979, for Uroleucon ambrosiae. We think it is failing because it cannot find a protein set for the taxid you used and no SRA data was provided. Let me know how it goes. Thanks!
Thanks, will try that out.
Also, the singularity option seems to work fine under the Slurm scheduler as long as you activate the conda and virtualenv environments in the sbatch file itself and cd out of the default path.
EDIT: To run under Slurm without path errors, Nextflow, pip, and Python had to be run from a conda environment stored in the same directory as the virtualenv, each in its own folder, with the conda environment and the virtualenv activated separately.
To create a conda env in a folder other than the default, use the following command format:
conda create -p path/to/wanted/env/location
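For anyone trying to reproduce the Slurm setup described above, here is a minimal sketch of what such an sbatch file might look like. Everything specific in it is an assumption for illustration: the directory names (`ENV_DIR`, `conda_env`, `venv`, `EGAPX_DIR`), the resource requests, the input file name, and the final command line, which is modeled on the project README rather than on the command actually used in this thread. The snippet only writes the job file and syntax-checks it; submitting it is left as a commented step.

```shell
#!/bin/bash
# Write a hypothetical sbatch file, then check that it parses as bash.
# All paths and resource values below are placeholders, not values from this thread.
cat > egapx_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=egapx
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
set -e

# Hypothetical layout: the conda env and the virtualenv live side by side
# in separate folders under the same parent directory.
ENV_DIR="$HOME/egapx_envs"
EGAPX_DIR="$HOME/egapx"

# Make `conda activate` available in a non-interactive batch shell.
source "$(conda info --base)/etc/profile.d/conda.sh"

# Activate the conda env (created with `conda create -p`) and the virtualenv separately.
conda activate "$ENV_DIR/conda_env"
source "$ENV_DIR/venv/bin/activate"

# Leave the default submission path before launching the pipeline.
cd "$EGAPX_DIR"

# Assumed invocation, adapted from the README; adjust to your own input and output names.
python3 ui/egapx.py my_input.yaml -e singularity -w work_dir -o out_dir
EOF

# Syntax check only; nothing is executed or submitted here.
bash -n egapx_job.sbatch && echo "syntax ok"

# To submit on a Slurm cluster:
#   sbatch egapx_job.sbatch
```

The quoted heredoc delimiter (`'EOF'`) keeps `$HOME` and the other variables from being expanded when the file is written, so they are resolved on the compute node at run time instead.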
With the SRA_ids option this runs to completion.
Amazing, the number of raw predicted genes is in the same order of magnitude as expected (~17,000).
This anecdotally seems faster and more accurate than the Braker3 pipeline I have been using.
🎉🎉🎉
Glad to hear that worked. Note we're still adding features to better characterize the annotation and add protein support for more organisms including aphids. So you'll want to run this again in a few months as the pipeline matures. More RNA-seq would be nice, but we do often see reasonable results with surprisingly small datasets.
Afternoon,
While running the following command on a genome assembly:
I received the following error:
This seems to be an issue where gnomon has no output.
Here is the configuration file I used.