Closed znelson999 closed 1 month ago
Hi, Thank you for reporting.
Can you please post the command you used?
thanks, Pooja
Here you are
module load python_3/3.11.1
python -m venv /project/cricket_gen/ZachN/Virtual/EGAP
source /project/cricket_gen/ZachN/Virtual/EGAP/bin/activate
pip install -r /project/cricket_gen/ZachN/EGAPx/egapx-main/ui/requirements.txt
module load nextflow/23.04.3
module load singularityCE/3.11.4
python3 /project/cricket_gen/ZachN/EGAPx/egapx-main/ui/egapx.py /project/cricket_gen/ZachN/EGAPx/egapx-main/examples/input_D_farinae_small.yaml
python3 /project/cricket_gen/ZachN/EGAPx/egapx-main/ui/egapx.py /project/cricket_gen/ZachN/EGAPx/egapx-main/yamlFiles/oldTenebrioMol.yaml -e singularity -w /project/cricket_gen/ZachN/Virtual/WorkingDirectory -o /project/cricket_gen/ZachN/Virtual/EGAPOutputEditedTenebrioMol
The following is submitted via .sh.
Currently, the read files have to end in .1, .2 to be paired up. In future, we will make it more flexible.
To read gzipped read files, edit ui/assets/default_task_params.yaml:
under star_wnode there is a -star-params argument list. Add --readFilesCommand zcat to that list.
Also, try this command for your first egapx run:
python3 /project/cricket_gen/ZachN/EGAPx/egapx-main/ui/egapx.py /project/cricket_gen/ZachN/EGAPx/egapx-main/examples/input_D_farinae_small.yaml -o outdir
Hello,
I have modified the default_task_params.yaml file but am still getting an index issue. The error I'm getting is the same as before:
ERROR ~ index is out of range 0..-1 (index = 0)
Any other things I can try to fix this?
Hi, the best thing to do is to wait for our next version update. We are actively working on the issue that you brought up. Currently, the pipeline also has trouble reading gz read files.
We appreciate your testing and reporting. We'll reach out when the fix is ready.
Pooja
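Until gz input is handled, one option is to decompress the reads up front. This is only a minimal Python sketch: the function name `gunzip_copy` and the destination directory are hypothetical, and it addresses only gz reading, not the "index is out of range" error itself.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_copy(src, dest_dir):
    """Decompress a .gz file into dest_dir, leaving the original untouched."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Drop a trailing ".gz" from the name; keep the name as-is otherwise.
    name = src.name[:-3] if src.name.endswith(".gz") else src.name
    dst = dest_dir / name
    # Stream the data so large read files are never held in memory at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, `gunzip_copy("t3_ll_1_R1.fastq.gz", "plain_reads")` would write `plain_reads/t3_ll_1_R1.fastq`; the YAML input would then need to point at the decompressed files.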
Very well, thank you for your assistance.
@znelson999 can you supply me with a short sample of the FASTQ files you use, say the first 100 lines of the first 4 files in your list (after zcat'ting them, of course)? Something like this would do:
zcat /project/cricket_gen/ZachN/EGAPx/TenebrioMolTranscriptome/Tmol_RNA_transcriptome_lifestages/t3_ll_1_R1.fastq.gz | head -100 > t3_ll_1_R1.fastq.sample
I want to ensure that our code works with the output from a real sequencing machine (which I assume this is) vs. processed reads from SRA.
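Where zcat is not available, the same sampling can be sketched in Python. The function name `sample_fastq` is hypothetical; the default of 100 lines matches the request above and corresponds to 25 four-line FASTQ records.

```python
import gzip
from itertools import islice
from pathlib import Path


def sample_fastq(src, dst, n_lines=100):
    """Write the first n_lines of a gzipped FASTQ file to dst as plain text."""
    # islice stops after n_lines, so only the start of the file is decompressed.
    with gzip.open(src, "rt") as fin, open(dst, "w") as fout:
        fout.writelines(islice(fin, n_lines))
    return Path(dst)
```

Calling `sample_fastq("t3_ll_1_R1.fastq.gz", "t3_ll_1_R1.fastq.sample")` once per read file would produce the samples; make sure each call reads a different source file.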
Hi @victzh
Attached is a zip file with the portions you requested, let me know if this helps.
@znelson999 thanks! It seems to me that t3_ll_1_R1.fastq.gz and t3_ll_1_R2.fastq.gz are parts of a paired run. Why then do the samples match each other exactly? Shouldn't they be different ends of the same piece of RNA? I'm not a biologist, I'm a programmer, so pardon my ignorance.
@victzh
Thanks for pointing this out. I made a mistake when making those sample files. Attached are the appropriate zipped files. RealFastqFiles.zip
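A quick way to catch this kind of mix-up before uploading is to compare the two mate files record by record. The sketch below assumes plain-text FASTQ with exactly four lines per record; `read_records` and `compare_mates` are hypothetical helper names.

```python
from itertools import islice


def read_records(path):
    """Yield (header, sequence) tuples from a plain-text FASTQ file."""
    with open(path) as fh:
        while True:
            record = list(islice(fh, 4))  # header, sequence, '+', qualities
            if len(record) < 4:
                break
            yield record[0].rstrip("\n"), record[1].rstrip("\n")


def compare_mates(r1_path, r2_path):
    """Return (records_compared, records_with_identical_sequence)."""
    total = identical = 0
    # zip stops at the shorter file, so only common records are compared.
    for rec1, rec2 in zip(read_records(r1_path), read_records(r2_path)):
        total += 1
        if rec1[1] == rec2[1]:
            identical += 1
    return total, identical
```

If every compared record has an identical sequence, the two files are probably copies of the same mate rather than opposite ends of the fragments.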
@znelson999 thanks, it helps a lot. I tried to run a newer version with your data but it failed (so far) because the samples are too short. We need to fix this and will see what else fails.
Hello! I'm encountering the same "index is out of range" error while running egapx with local RNA-Seq data. I've tried renaming my unzipped files to {prefix}.1, {prefix}.2, {prefix}.1.fastq, and {prefix}.2.fastq, but the error persists.
Here's the error message:
ERROR ~ index is out of range 0..-1 (index = 0)
-- Check script 'nf/./subworkflows/ncbi/./rnaseq_short/star_wnode/main.nf' at line: 83 or see '/media/eternus1/data/vgp/glis_glis/users/zilov/annotation/egapx/egapx/glis_out_last/nextflow.log' file for more details
I installed egapx following the README instructions and am running it within the nextflow conda environment. Test run using the example data completed successfully.
Is there a way to troubleshoot this now? Should I wait for the next version of the tool, or is the problem likely on my side?
Input file: input_yaml2.txt
Hi @zilov, it looks like the underscores in the filenames are causing the problem. We are working on this. For now, you could remove the underscores and give it a try. Pooja
Thank you, that works!
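For anyone with many read files, the renaming can be planned with a short script. This is a sketch under assumptions: it expects input names of the form <sample>_R1.fastq / <sample>_R2.fastq (optionally .gz), and it emits {prefix}.1.fastq / {prefix}.2.fastq, one of the naming forms mentioned earlier in the thread. The thread does not state which exact form is required, so adjust the output format if needed. `clean_name` and `plan_renames` are hypothetical names.

```python
import re
from pathlib import Path

# Assumed input pattern: <sample>_R1.fastq or <sample>_R2.fastq, optionally .gz
MATE_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])\.fastq(?P<gz>\.gz)?$")


def clean_name(filename):
    """Map e.g. 't3_ll_1_R1.fastq' to 't3ll1.1.fastq' (no underscores)."""
    match = MATE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised read file name: {filename}")
    sample = match.group("sample").replace("_", "")
    gz = match.group("gz") or ""
    return f"{sample}.{match.group('mate')}.fastq{gz}"


def plan_renames(directory):
    """Return {old_path: new_path} for every matching read file in directory."""
    plan = {}
    for path in sorted(Path(directory).iterdir()):
        if MATE_PATTERN.match(path.name):
            plan[path] = path.with_name(clean_name(path.name))
    return plan
```

Review the plan before applying it (for instance with `old.rename(new)` for each pair), since stripping underscores can make two different sample names collide, such as `a_b` and `ab`.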
Hello,
I'm attempting to include RNA-seq data in my run and am getting an array error. The sequences are locally hosted and not in NCBI. Included are the error I'm getting and the yaml file being used.
yamlFileUsed.txt
EGAPissue.txt
The documentation says that RNA-seq files have to end in 1 or 2; could that be causing my problems? If so, is there a workaround that lets me use data that is not on NCBI? Or could it just be an error in the way I've set up the required files?
Thank you for any assistance you can provide.