Error running align - Githubissues

Rdebar commented 4 years ago

Hi everyone! I'm trying to run READemption for the first time on Ubuntu with the example data (also, I'm not an expert in bioinformatics) and I'm getting this error:

[E::hts_open_format] Failed to open file "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" : No such file or directory
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 294, in align_reads
    controller.align_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 81, in align_reads
    self._align_single_end_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 328, in _align_single_end_reads
    paired_end=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/readaligner.py", line 20, in run_alignment
    paired_end=paired_end)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/segemehl.py", line 75, in align_reads
    catch_stdout=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/pysam/utils.py", line 75, in __call__
    stderr))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" for reading: No such file or directory\n'

Following the advice that other people received, I'm trying to update segemehl to 0.3.4, but I don't know how to do it just by following the instructions provided here: https://reademption.readthedocs.io/en/latest/installation.html I have seen that the files obtained from unzipping it are different from those of the 0.2.0 version. Maybe that's the reason I cannot install it by following the instructions I mentioned.

Please, any advice will be more than welcome. Thanks!!

Tillsa commented 4 years ago

Hi! You can use conda to install segemehl 0.3.4 (which is required for READemption version 0.6.0):

conda install -c bioconda segemehl

Also make sure that segemehl, or rather the conda bin folder, is on your PATH. You can do so by adding a line similar to this one to your bash profile file:

export PATH="$HOME/anaconda/bin:$PATH"

This is described in the answer to this Stack Overflow question: https://stackoverflow.com/questions/35076536/i-have-to-type-export-path-anaconda-binpath-everytime-i-rerun-the-terminal

Best regards,

Till
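
To confirm that the conda bin folder really ended up on the PATH, a small Python check can help. This is only a sketch: the helper name find_tools is made up for illustration, and it assumes the aligner binary is called segemehl.x, which is the file name used in the installation commands quoted in this thread.

```python
import shutil


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Assumption: "segemehl.x" is the aligner binary and "reademption" the
    # command-line entry point, as they appear elsewhere in this thread.
    for name, path in find_tools(["segemehl.x", "reademption"]).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If segemehl.x is reported as not found, the export line has probably not been picked up by the current shell yet; opening a new terminal or sourcing the profile file is the usual next step.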

Rdebar commented 4 years ago

Thanks a lot for the response. Now I downloaded the right version, but I'm still trying to follow these installation instructions after decompressing:

cd segemehl_*/segemehl/ && make && cd ../../

and it says that there is no such file or directory. Would it be different for the 0.3.4 version of segemehl?

Best, Rubén

Tillsa commented 4 years ago

After installing segemehl via conda you don't need to use the make command. However, if you want to download and build segemehl from https://www.bioinf.uni-leipzig.de/Software/segemehl/ you need to make sure you are in the folder where segemehl's Makefile is located. Best, Till

Rdebar commented 4 years ago

Cool, then segemehl must be in place now. Now a new error message when running it:

pvc@DamienPC:~/Escritorio/Ruben$ reademption align -p 4 --poly_a_clipping -f READemption_analysis
[E::hts_open_format] Failed to open file "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" : No such file or directory
Traceback (most recent call last):
  File "/home/pvc/.local/bin/reademption", line 320, in <module>
    main()
  File "/home/pvc/.local/bin/reademption", line 284, in main
    args.func(controller)
  File "/home/pvc/.local/bin/reademption", line 294, in align_reads
    controller.align_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 81, in align_reads
    self._align_single_end_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 328, in _align_single_end_reads
    paired_end=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/readaligner.py", line 20, in run_alignment
    paired_end=paired_end)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/segemehl.py", line 75, in align_reads
    catch_stdout=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/pysam/utils.py", line 75, in __call__
    stderr))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" for reading: No such file or directory\n'

Thanks a million for your help! Best, Rubén

Tillsa commented 4 years ago

Please run $ reademption --version and let me know which version you are using.

Rdebar commented 4 years ago

READemption version 0.6.0

Tillsa commented 4 years ago

Ok, that is the right version. Can you please post the output of find READemption_analysis

Rdebar commented 4 years ago

pvc@DamienPC:~/Escritorio/Ruben$ find READemption_analysis
READemption_analysis
READemption_analysis/input
READemption_analysis/input/reference_sequences
READemption_analysis/input/reference_sequences/NC_017718.fa
READemption_analysis/input/reference_sequences/NC_017720.fa
READemption_analysis/input/reference_sequences/NC_017719.fa
READemption_analysis/input/reference_sequences/NC_016810.fa
READemption_analysis/input/annotations
READemption_analysis/input/annotations/NC_017718.gff
READemption_analysis/input/annotations/NC_017720.gff
READemption_analysis/input/annotations/NC_017719.gff
READemption_analysis/input/annotations/NC_016810.gff
READemption_analysis/input/reads
READemption_analysis/input/reads/InSPI2_R1.fa.bz2
READemption_analysis/input/reads/InSPI2_R2.fa.bz2
READemption_analysis/input/reads/LSP_R1.fa.bz2
READemption_analysis/input/reads/LSP_R2.fa.bz2
READemption_analysis/output
READemption_analysis/output/align
READemption_analysis/output/align/unaligned_reads
READemption_analysis/output/align/processed_reads
READemption_analysis/output/align/processed_reads/InSPI2_R1_processed.fa.gz
READemption_analysis/output/align/processed_reads/LSP_R1_processed.fa.gz
READemption_analysis/output/align/processed_reads/InSPI2_R2_processed.fa.gz
READemption_analysis/output/align/processed_reads/LSP_R2_processed.fa.gz
READemption_analysis/output/align/alignments
READemption_analysis/output/align/index
READemption_analysis/output/align/index/index.idx
READemption_analysis/output/align/reports_and_stats
READemption_analysis/output/align/reports_and_stats/version_log.txt
READemption_analysis/output/align/reports_and_stats/stats_data_json
READemption_analysis/output/align/reports_and_stats/stats_data_json/read_processing.json
READemption_analysis/output/viz_gene_quanti
READemption_analysis/output/viz_align
READemption_analysis/output/coverage
READemption_analysis/output/coverage/coverage-tnoar_mil_normalized
READemption_analysis/output/coverage/coverage-raw
READemption_analysis/output/coverage/coverage-tnoar_min_normalized
READemption_analysis/output/deseq
READemption_analysis/output/deseq/deseq_raw
READemption_analysis/output/deseq/deseq_with_annotations
READemption_analysis/output/viz_deseq
READemption_analysis/output/gene_quanti
READemption_analysis/output/gene_quanti/gene_quanti_per_lib
READemption_analysis/output/gene_quanti/gene_quanti_combined

Tillsa commented 4 years ago

I just ran the tutorial on my machine and had no problems. I used a bash script to do so, which I uploaded as run_reademption_tutorial.sh. Could you please use the script to run the example analysis? You need to change line 2 and line 3 according to your system, where it says:

readonly READEMPTION=/home/till/Documents/READemption_developing/0.6.0/READemption/bin/reademption

and

readonly READEMPTION_ANALYSIS_FOLDER=READemption_analysis

Does the error persist?

Rdebar commented 4 years ago

Ok this is awkward for me already... I'm running the script substituting line 2 with

readonly READEMPTION=/home/pvc/Escritorio/Ruben/

which is where I'm doing the analysis, but it reports a warning for line 20 saying that the file or directory does not exist. Does it happen to refer to the reademption script? Because I cannot find it in that directory after the installation.

Thanks a lot for your patience. Rubén

Tillsa commented 4 years ago

I think the right line in your case would be similar to this one: readonly READEMPTION=/home/pvc/Escritorio/Ruben/READemption/bin/reademption

Rdebar commented 4 years ago

It reports the same in both cases. Might I have done something wrong in the installation?

Tillsa commented 4 years ago

I don't think you did anything wrong during installation. But you could post all the commands you ran for installing, and I will have a look and let you know if something is missing.

Maybe the following issue, where someone had a similar error, helps you: https://github.com/PacificBiosciences/FALCON_unzip/issues/48 "I have had this exact same problem recently. It turned out to be that the system limits the number of opened files. After increasing the limit, the error is gone."

Best, Till
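
The open-file limit mentioned in the linked issue can be read from Python with the standard resource module (Unix only). The following is a minimal sketch with illustrative function names; whether this limit is actually behind the READemption error is not established in this thread.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def format_limit(value):
    """Render a limit as text; RLIM_INFINITY means no limit is set."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {format_limit(soft)}, hard limit: {format_limit(hard)}")
```

In a bash shell, ulimit -n reports the same soft limit, which is the value that applies to programs started from that shell.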

Rdebar commented 4 years ago

There might be something I missed. After updating segemehl, I tried to run these lines of the installation instructions, updating them to the new version name (segemehl-0.3.4):

sudo cp segemehl_0_2_0/segemehl/segemehl.x /usr/bin/segemehl.x
sudo cp segemehl_0_2_0/segemehl/lack.x /usr/bin/lack.x

But it did not work (it says that the directory or item does not exist). Could that be the problem? As I have checked, it is the only different thing that I have with respect to the instructions.

Thanks a lot for your patience. Best, Rubén

Tillsa commented 4 years ago

You don't need to run the two lines you mentioned after installing segemehl via conda. I guess it's a problem with your system. Maybe try to install READemption on a different system or use a VM. And I still recommend that you have a look at the issue (regarding the limit of open files) I linked above. Best, Till

Rdebar commented 4 years ago

Finally I fixed the readonly thing, found the right location of everything, updated everything and tried to run the example analysis from the script. At the beginning I got this:

pvc@DamienPC:~/Escritorio/Ruben$ bash run_reademption_tutorial.sh
run_reademption_tutorial.sh: line 5: create_project: command not found
run_reademption_tutorial.sh: line 6: store_environment_variable: command not found
run_reademption_tutorial.sh: line 7: download_fasta: command not found
run_reademption_tutorial.sh: line 8: modify_fasta_header: command not found
run_reademption_tutorial.sh: line 9: download_annotation: command not found
run_reademption_tutorial.sh: line 10: download_reads: command not found

So I deleted those commands from the script and created the analysis file myself. Then I got this, which is the same as what I get when running it manually:

pvc@DamienPC:~/Escritorio/Ruben$ bash run_reademption_tutorial.sh
[E::hts_open_format] Failed to open file "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" : No such file or directory
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 294, in align_reads
    controller.align_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 81, in align_reads
    self._align_single_end_reads()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 328, in _align_single_end_reads
    paired_end=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/readaligner.py", line 20, in run_alignment
    paired_end=paired_end)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/segemehl.py", line 75, in align_reads
    catch_stdout=False)
  File "/home/pvc/.local/lib/python3.6/site-packages/pysam/utils.py", line 75, in __call__
    stderr))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" for reading: No such file or directory\n'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 298, in create_coverage_files
    controller.create_coverage_files()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 465, in create_coverage_files
    self._paths.read_alignments_stats_path)]
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/rawstatdata.py", line 24, in read
    with open(input_file) as input_fh:
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/align/reports_and_stats/stats_data_json/read_alignments_final.json'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 302, in run_gene_wise_quantification
    controller.quantify_gene_wise()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 602, in quantify_gene_wise
    self._paths.read_alignments_stats_path)]
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/rawstatdata.py", line 24, in read
    with open(input_file) as input_fh:
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/align/reports_and_stats/stats_data_json/read_alignments_final.json'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 306, in run_deseq
    controller.compare_with_deseq()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 735, in compare_with_deseq
    self._check_deseq_args(arg_libs, conditions)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 761, in _check_deseq_args
    self._paths.read_alignments_stats_path)]
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/rawstatdata.py", line 24, in read
    with open(input_file) as input_fh:
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/align/reports_and_stats/stats_data_json/read_alignments_final.json'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 310, in viz_align
    controller.viz_align()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 788, in viz_align
    align_viz.read_stat_files()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/vizalign.py", line 22, in read_stat_files
    with open(self._read_aligner_stats_path) as read_aligner_stats_fh:
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/align/reports_and_stats/stats_data_json/read_alignments_final.json'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 314, in viz_gene_quanti
    controller.viz_gene_quanti()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 801, in viz_gene_quanti
    gene_quanti_viz.parse_input_table()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/vizgenequanti.py", line 30, in parse_input_table
    open(self._gene_wise_quanti_combined_path), delimiter="\t"):
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/gene_quanti/gene_quanti_combined/gene_wise_quantifications_combined.csv'
Traceback (most recent call last):
  File "/usr/local/bin/reademption", line 320, in <module>
    main()
  File "/usr/local/bin/reademption", line 284, in main
    args.func(controller)
  File "/usr/local/bin/reademption", line 318, in viz_deseq
    controller.viz_deseq()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/controller.py", line 816, in viz_deseq
    self._paths.viz_deseq_scatter_plot_path)
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/vizdeseq.py", line 25, in create_scatter_plots
    conditions = self._extract_condition_names()
  File "/home/pvc/.local/lib/python3.6/site-packages/reademptionlib/vizdeseq.py", line 75, in _extract_condition_names
    with open(self._deseq_script_path) as deseq_script_fh:
FileNotFoundError: [Errno 2] No such file or directory: 'READemption_analysis/output/deseq/deseq_raw/deseq.R'

First of all, I don't know whether it should be a problem that it cannot find and open an output file (failed to open "READemption_analysis/output/align/alignments/InSPI2_R1_alignments_final.bam" for reading: No such file or directory) that should be created when running it. Does that make sense? In the rest of the errors, it seems to fail to find a few things within the project directory. Might there be a library or accessory that I am missing? I have checked the limit for open files and it is 4096; I don't know whether that is enough to run it. About the VM: I tried to run READemption in one some time ago and it could not handle the processing, that's why I was trying it on a Linux computer.

Thanks a million for your patience. Best, Rubén

deppworld commented 2 years ago

@Tillsa Hi, I am facing the same issue with my own data files, although the example data set works fine. I have reproduced the steps from the example data set instructions but still get the error:

pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open ".//output/align/alignments/SRR10957259_alignments_final.bam" for reading: No such file or directory\n'

My terminal session is as follows:

(base) dverma2@crb-5zx8nd2:~/Desktop/readmpt_fn$ reademption align -q -g -f ./
[SEGEMEHL] Thu Jul 29 14:22:29 2021: reading database sequences.
[SEGEMEHL] Thu Jul 29 14:22:49 2021: 61 database sequences found.
[SEGEMEHL] Thu Jul 29 14:22:49 2021: total length of db sequences: 2728222451
[SEGEMEHL] Thu Jul 29 14:22:49 2021: assigning all reads to default read group 'A1'.
[SEGEMEHL] Thu Jul 29 14:22:49 2021: additional read group default values ' SM:sample1 LB:library1 PU:unit1 PL:illumina'
[SEGEMEHL] Thu Jul 29 14:22:49 2021: reads assigned to read group 'A1'
[SEGEMEHL] Thu Jul 29 14:22:49 2021: compiled sam header.
[SEGEMEHL] Thu Jul 29 14:22:54 2021: reading queries in './/output/align/processed_reads/SRR10957259_processed.fa.gz'.
[SEGEMEHL] Thu Jul 29 14:24:19 2021: 40210880 query sequences found.
[SEGEMEHL] Thu Jul 29 14:24:19 2021: reading database sequences.
[SEGEMEHL] Thu Jul 29 14:24:40 2021: 61 database sequences found.
[SEGEMEHL] Thu Jul 29 14:24:40 2021: total length of db sequences: 2728222451
[SEGEMEHL] Thu Jul 29 14:24:40 2021: assigning all reads to default read group 'A1'.
[SEGEMEHL] Thu Jul 29 14:24:40 2021: additional read group default values ' SM:sample1 LB:library1 PU:unit1 PL:illumina'
[SEGEMEHL] Thu Jul 29 14:24:40 2021: reads assigned to read group 'A1'
[SEGEMEHL] Thu Jul 29 14:24:40 2021: compiled sam header.
[SEGEMEHL] Thu Jul 29 14:24:40 2021: reading suffix array './/output/align/index/index.idx' from disk.
[E::hts_open_format] Failed to open file ".//output/align/alignments/SRR10957259_alignments_final.bam" : No such file or directory
Traceback (most recent call last):
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 315, in <module>
    main()
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 22, in main
    args.func(controller)
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 288, in align_reads
    controller.align_reads()
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/controller.py", line 87, in align_reads
    self._align_single_end_reads()
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/controller.py", line 385, in _align_single_end_reads
    read_aligner.run_alignment(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/readaligner.py", line 26, in run_alignment
    self.segemehl.align_reads(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/segemehl.py", line 95, in align_reads
    pysam.view(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/pysam/utils.py", line 69, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open ".//output/align/alignments/SRR10957259_alignments_final.bam" for reading: No such file or directory\n'

Version for supporting tools:

Program: samtools (Tools for alignments in the SAM format)
Version: 1.10 (using htslib 1.13)

READemption version: 1.0.1
Python version: 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 10.3.0]
Biopython version: 1.79
pysam version: 0.16.0.1
matplotlib version: 3.3.0
pandas version: 1.3.1
segemehl-0.3.4

Log of read_processing.json:

{
  "SRR10957259": {
    "total_no_of_reads": 40210880,
    "polya_removed": 0,
    "single_a_removed": 0,
    "unmodified": 40210880,
    "too_short": 0,
    "long_enough": 40210880,
    "read_length_before_processing_and_freq": {
      "76": 40210880
    },
    "read_length_after_processing_and_freq": {
      "76": 40210880
    }
  }
}

dverma2@crb-5zx8nd2:~/Desktop/readmpt_fn/output/align/processed_reads$ head SRR10957259_processed.fa

SRR10957259.1 1 length=76
GGCAANTCTCAGACAGCAGGGCTTCTACTGGTCTTTCAGATCCTTCAGTCTTCTNNTGGCAGACTTCANTGTGACT
SRR10957259.2 2 length=76
GTCAGNAGCACGACTTGATCTTCGGGGGCAATGCCTTCCAGGGAGGCCACATGANNTTTGATCTGGGCNACCGTCT
SRR10957259.3 3 length=76
GTCCGNAGTACACAATTTCCCCGGATGACTTCTTCATCTTCTTCAGCTGTGACANNAAGTACCAGAAGNGGGACTT
SRR10957259.4 4 length=76
CTCAGNTTCCAGTTCTTGCTTCATCTTGGCAAACTCTTCTTTTGTCATAGATCCTNCCCCTTCTCCCAGTTTCTGC
SRR10957259.5 5 length=76
CCAGCNGCAAGATTAACGCAACCTTCGAGCTTCTCTTTCTGACTCCAATAGGGTGNGCACGTCACCCTCTCGAACG

Tillsa commented 2 years ago

Hi @deppworld,

I just saw that your align command uses the option "-q" which is only required for FASTQ files. Your reads are in FASTA format. If you remove the option "-q" from your command it should work.

Best wishes,

Till
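
Whether a read file needs the "-q" option can be guessed from its first record: FASTA records start with ">", FASTQ records start with "@". The sketch below applies only that first-character convention. The function names are hypothetical, and it assumes the reads are plain text or compressed with gzip or bzip2, as the example reads in this thread are.

```python
import bz2
import gzip
from pathlib import Path


def open_reads(path):
    """Open a read file as text, transparently handling .gz and .bz2 files."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    if path.suffix == ".bz2":
        return bz2.open(path, "rt")
    return open(path, "rt")


def sniff_read_format(path):
    """Return 'fasta', 'fastq', 'unknown' or 'empty' based on the first non-blank line."""
    with open_reads(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "unknown"
    return "empty"
```

A result of "unknown" means the first record starts with neither marker, which is worth checking before starting an alignment run.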

deppworld commented 2 years ago

Hi Till, I tried both ways (with and without -q) but still get the same error. This is mouse RNA-seq data, and the files are not big enough to cause any hardware issue. Please have a look and help me troubleshoot it. I thought it might be a samtools error, but samtools is working fine.

(base) dverma2@crb-5zx8nd2:~/Desktop$ reademption align -f readmpt_fn/
[E::hts_open_format] Failed to open file "readmpt_fn//output/align/alignments/SRR10957259_alignments_final.bam" : No such file or directory
Traceback (most recent call last):
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 315, in <module>
    main()
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 22, in main
    args.func(controller)
  File "/home/WIN/dverma2/anaconda3/bin/reademption", line 288, in align_reads
    controller.align_reads()
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/controller.py", line 87, in align_reads
    self._align_single_end_reads()
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/controller.py", line 385, in _align_single_end_reads
    read_aligner.run_alignment(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/readaligner.py", line 26, in run_alignment
    self.segemehl.align_reads(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/reademptionlib/segemehl.py", line 95, in align_reads
    pysam.view(
  File "/home/WIN/dverma2/anaconda3/lib/python3.8/site-packages/pysam/utils.py", line 69, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open "readmpt_fn//output/align/alignments/SRR10957259_alignments_final.bam" for reading: No such file or directory\n'

Tillsa commented 2 years ago

Can you please try:

$ reademption align -f readmpt_fn

That is without the "/" at the end of the project path

Tillsa commented 2 years ago

Did you try the tutorial? If that worked well, we can rule out problems due to your system and the installed packages. Another thing that I noticed, is that your read files in FASTA don't have ">" as headers. You could create a very small dummy file with 10 reads and add ">" in front of each header line and see if that might solve the problem.

deppworld commented 2 years ago

Hi Tillsa, thanks for your prompt reply. I followed both of the suggestions above; this time I created a demo file in FASTA format and did the same, but I still get the same error. I have also checked samtools by running it separately. Still, I could not fix the BAM file error. I compared my output with the test result folder and found that no index file and no BAM file had been created.

readmpt_fn/
├── input
│   ├── annotations
│   │   └── SRR10957259.gff3
│   ├── reads
│   │   └── SRR10957259.fa
│   └── reference_sequences
│       └── SRR10957259.fa
└── output
    ├── align
    │   ├── alignments
    │   ├── index
    │   ├── processed_reads
    │   │   └── SRR10957259_processed.fa.gz
    │   ├── reports_and_stats
    │   │   ├── stats_data_json
    │   │   │   └── read_processing.json
    │   │   └── version_log.txt
    │   └── unaligned_reads
    │       └── SRR10957259_unaligned.fa
    ├── coverage
    │   ├── coverage-raw
    │   ├── coverage-tnoar_mil_normalized
    │   └── coverage-tnoar_min_normalized
    ├── deseq
    │   ├── deseq_raw
    │   └── deseq_with_annotations
    ├── gene_quanti
    │   ├── gene_quanti_combined
    │   └── gene_quanti_per_lib
    ├── viz_align
    ├── viz_deseq
    └── viz_gene_quanti

Please suggest

Thanks

Tillsa commented 2 years ago

Did you try the tutorial?

deppworld commented 2 years ago

Yes, I have run the tutorial with the test samples and it worked fine, but this error appears when I use my own samples. Would it be possible for you to check this on your system with SRR10957259?

READemption_analysis/
├── input
│   ├── annotations
│   │   ├── NC_016810.gff
│   │   ├── NC_017718.gff
│   │   ├── NC_017719.gff
│   │   └── NC_017720.gff
│   ├── reads
│   │   ├── InSPI2_R1.fa.bz2
│   │   ├── InSPI2_R2.fa.bz2
│   │   ├── LSP_R1.fa.bz2
│   │   └── LSP_R2.fa.bz2
│   └── reference_sequences
│       ├── NC_016810.fa
│       ├── NC_017718.fa
│       ├── NC_017719.fa
│       └── NC_017720.fa
└── output
    ├── align
    │   ├── alignments
    │   │   ├── InSPI2_R1_alignments_final.bam
    │   │   ├── InSPI2_R1_alignments_final.bam.bai
    │   │   ├── InSPI2_R2_alignments_final.bam
    │   │   ├── InSPI2_R2_alignments_final.bam.bai
    │   │   ├── LSP_R1_alignments_final.bam
    │   │   ├── LSP_R1_alignments_final.bam.bai
    │   │   ├── LSP_R2_alignments_final.bam
    │   │   └── LSP_R2_alignments_final.bam.bai
    │   ├── index
    │   │   └── index.idx
    │   ├── processed_reads
    │   │   ├── InSPI2_R1_processed.fa.gz
    │   │   ├── InSPI2_R2_processed.fa.gz
    │   │   ├── LSP_R1_processed.fa.gz
    │   │   └── LSP_R2_processed.fa.gz
    │   ├── reports_and_stats
    │   │   ├── read_alignment_stats.csv
    │   │   ├── stats_data_json
    │   │   │   ├── read_alignments_final.json
    │   │   │   └── read_processing.json
    │   │   └── version_log.txt
    │   └── unaligned_reads
    │       ├── InSPI2_R1_unaligned.fa
    │       ├── InSPI2_R2_unaligned.fa
    │       ├── LSP_R1_unaligned.fa
    │       └── LSP_R2_unaligned.fa
    ├── coverage
    │   ├── coverage-raw
    │   ├── coverage-tnoar_mil_normalized
    │   └── coverage-tnoar_min_normalized
    ├── deseq
    │   ├── deseq_raw
    │   └── deseq_with_annotations
    ├── gene_quanti
    │   ├── gene_quanti_combined
    │   └── gene_quanti_per_lib
    ├── viz_align
    ├── viz_deseq
    └── viz_gene_quanti

In my case, I am not getting the BAM and index files.

Thanks Deepak
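
Comparing a failed project against the working tutorial tree can also be done programmatically. The sketch below lists which of the align outputs seen in the tutorial tree (index/index.idx and, per library, a _alignments_final.bam file with its .bam.bai index) are missing from a project folder. The function name is made up for illustration, and the expected file names are taken only from the listings in this thread, not from the READemption documentation.

```python
from pathlib import Path


def missing_align_outputs(project_dir, libraries):
    """List expected align outputs that do not exist as files in the project."""
    align = Path(project_dir) / "output" / "align"
    expected = [align / "index" / "index.idx"]
    for library in libraries:
        # File names follow the pattern seen in the tutorial tree of this thread.
        bam = align / "alignments" / f"{library}_alignments_final.bam"
        expected.append(bam)
        expected.append(bam.with_name(bam.name + ".bai"))
    return [str(path) for path in expected if not path.is_file()]
```

For example, missing_align_outputs("readmpt_fn", ["SRR10957259"]) would report all three files for the tree shown earlier, since both the index and alignments folders are empty there.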

Tillsa commented 2 years ago

Did you try the tutorial? If that worked well, we can rule out problems due to your system and the installed packages. Another thing that I noticed, is that your read files in FASTA don't have ">" as headers. You could create a very small dummy file with 10 reads and add ">" in front of each header line and see if that might solve the problem.

And did you fix the fasta headers?

deppworld commented 2 years ago

Yes.

(base) dverma2@crb-5zx8nd2:~/Desktop/readmpt_fn/input/reads$ head SRR10957259.fa

SRR10957259.1 1 length=76
GGCAANTCTCAGACAGCAGGGCTTCTACTGGTCTTTCAGATCCTTCAGTCTTCTNNTGGCAGACTTCANTGTGACT
SRR10957259.2 2 length=76
GTCAGNAGCACGACTTGATCTTCGGGGGCAATGCCTTCCAGGGAGGCCACATGANNTTTGATCTGGGCNACCGTCT
SRR10957259.3 3 length=76
GTCCGNAGTACACAATTTCCCCGGATGACTTCTTCATCTTCTTCAGCTGTGACANNAAGTACCAGAAGNGGGACTT
SRR10957259.4 4 length=76
CTCAGNTTCCAGTTCTTGCTTCATCTTGGCAAACTCTTCTTTTGTCATAGATCCTNCCCCTTCTCCCAGTTTCTGC
SRR10957259.5 5 length=76
CCAGCNGCAAGATTAACGCAACCTTCGAGCTTCTCTTTCTGACTCCAATAGGGTGNGCACGTCACCCTCTCGAACG

I am thinking of reinstalling the package and rerunning the pipeline. I hope that resolves the issue.

Tillsa commented 2 years ago

You need to add ">" in front of each header line to get a Fasta file that looks like this:

>SRR10957259.1 1 length=76
GGCAANTCTCAGACAGCAGGGCTTCTACTGGTCTTTCAGATCCTTCAGTCTTCTNNTGGCAGACTTCANTGTGACT
>SRR10957259.2 2 length=76
GTCAGNAGCACGACTTGATCTTCGGGGGCAATGCCTTCCAGGGAGGCCACATGANNTTTGATCTGGGCNACCGTCT
>SRR10957259.3 3 length=76
GTCCGNAGTACACAATTTCCCCGGATGACTTCTTCATCTTCTTCAGCTGTGACANNAAGTACCAGAAGNGGGACTT
>SRR10957259.4 4 length=76
CTCAGNTTCCAGTTCTTGCTTCATCTTGGCAAACTCTTCTTTTGTCATAGATCCTNCCCCTTCTCCCAGTTTCTGC
>SRR10957259.5 5 length=76
CCAGCNGCAAGATTAACGCAACCTTCGAGCTTCTCTTTCTGACTCCAATAGGGTGNGCACGTCACCCTCTCGAACG
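
A small check along these lines can flag header lines that lack the leading ">". This is a rough sketch, not a full FASTA validator: the function name is illustrative, and the accepted sequence alphabet (IUPAC nucleotide codes plus "-" and "*") is an assumption that may be stricter or looser than what segemehl itself accepts.

```python
import re

# Assumption: sequence lines consist only of IUPAC nucleotide codes, '-' or '*'.
SEQUENCE_LINE = re.compile(r"^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv*-]+$")


def fasta_problems(lines):
    """Return a list of human-readable problems found in FASTA-formatted lines."""
    problems = []
    seen_header = False
    pending_header = None  # line number of a header still waiting for sequence
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # blank lines are ignored
        if line.startswith(">"):
            if pending_header is not None:
                problems.append(f"line {pending_header}: header without sequence")
            seen_header = True
            pending_header = number
        elif not seen_header:
            problems.append(f"line {number}: text before the first '>' header")
        elif SEQUENCE_LINE.match(line):
            pending_header = None
        else:
            problems.append(f"line {number}: not a sequence line (missing '>'?)")
    if pending_header is not None:
        problems.append(f"line {pending_header}: header without sequence")
    return problems
```

Since the function takes any iterable of lines, an open file handle can be passed directly, for example fasta_problems(open("SRR10957259.fa")); an empty list means none of these particular problems were found.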

deppworld commented 2 years ago

Yes, the ">" is there, but due to the webpage settings it disappears when I copy it here. Can you tell me the minimum hardware configuration required for this pipeline? I am using a system with 8 cores and 32 GB RAM. This is the complete file; I used your tutorial command to create it.

SRR10957259.1 1 length=76
GGCAANTCTCAGACAGCAGGGCTTCTACTGGTCTTTCAGATCCTTCAGTCTTCTNNTGGCAGACTTCANTGTGACT
SRR10957259.2 2 length=76
GTCAGNAGCACGACTTGATCTTCGGGGGCAATGCCTTCCAGGGAGGCCACATGANNTTTGATCTGGGCNACCGTCT
SRR10957259.3 3 length=76
GTCCGNAGTACACAATTTCCCCGGATGACTTCTTCATCTTCTTCAGCTGTGACANNAAGTACCAGAAGNGGGACTT
SRR10957259.4 4 length=76
CTCAGNTTCCAGTTCTTGCTTCATCTTGGCAAACTCTTCTTTTGTCATAGATCCTNCCCCTTCTCCCAGTTTCTGC
SRR10957259.5 5 length=76
CCAGCNGCAAGATTAACGCAACCTTCGAGCTTCTCTTTCTGACTCCAATAGGGTGNGCACGTCACCCTCTCGAACG
SRR10957259.6 6 length=76
ATATGNGCATCTCCAGTCTCCACTGTCAACTGTGAGTTGATGGCCTCAAAGCTGGNGTTCTCCAATAGCTTCATGT
SRR10957259.7 7 length=76
GCCACNCTGGCACATGAATCCTGGAATAATTCTGTGAAAGGAGGAACCCTTATAGCCAAATCCTTTCTCTCCAGTG
SRR10957259.8 8 length=76
CTCTTNTCCAAGTGCAGTGCACACTCCATTGCATTCAGCCCGCTCTCCCAGTCATCACGGTCTGGTTTCTTTATAT
SRR10957259.9 9 length=76
CGGGAATGGACAGTCACAGGCTTGCGGATGATCAGCCCATCCTTGATCAGCTTCCTGATCTGCTGACGGGAGTTGG
SRR10957259.10 10 length=76
GTGCTTAATCTGCTCTGCAGCTCCAGTCATAAAAGGCTTTACTCTTTCTGGTTTCTGCTCTTCAAGTTTGCCTTTG

Tillsa commented 2 years ago

8 cores and 32 GB RAM are fine. If the tutorial doesn't raise an error but running READemption with your input files (reads and reference sequences in FASTA) does, I assume that your input files are corrupted or don't meet the FASTA specifications. You could try to validate them here: https://plabipd.de/portal/mercator-fasta-validator

kmuench commented 10 months ago

Hey, I wanted to report that I'm getting the following error using the tutorial data and the latest Docker image:

pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=None, stderr=samtools view: failed to open "READemption_analysis/output/align/alignments/file.bam" for reading: No such file or directory\n'

I haven't solved it yet. I notice the installed version of segemehl in the image is 0.2.0-418.

Tillsa commented 10 months ago

Hello @kmuench,

Could you please provide the stdout messages of the entire analysis? Did you run the tutorial from https://github.com/Tillsa/READemption_Docker_Tutorial via $ bash run_tutorial.sh all?

foerstner-lab / READemption

Error running align #28
