mvolar / SatXplor

A satDNA exploration pipeline
MIT License

Error running the test: run_full_tests.py #2

Closed JosemaRico closed 4 weeks ago

JosemaRico commented 4 weeks ago

Hello, I think I have successfully installed the SatXplor tool. I activate the Python and mamba virtual environments and run python /tests/tests.py, and the result seems correct:

Running installation tests.
R script executed successfully.
R is setup properly.
MAFFT is installed.
NCBI BLAST is installed.

Next I run: python satxplor/run_full_tests.py. Everything seems to start running correctly:

2024-08-19 10:29:01,387 - INFO - Setting up testing data environment.
2024-08-19 10:29:01,387 - INFO - Extracting data.
/usr/local/SatXplor/satxplor/run_full_tests.py:12: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
  tar.extract(member, path=extract_dir)
2024-08-19 10:29:01,654 - INFO - Starting the test run.
2024-08-19 10:29:02,006 - INFO - Python Version:
2024-08-19 10:29:02,006 - INFO - 3.12.5 | packaged by conda-forge | (main, Aug 8 2024, 18:36:51) [GCC 12.4.0]
2024-08-19 10:29:02,007 - INFO - {'INPUT_GENOME_PATH': './testing_data/test_sequence.fasta', 'GENOME_PATH': './input.fasta', 'SAT_RAW': './testing_data/test_sats.fasta', 'SAT_FASTA_PATH': './sats.fasta', 'FINAL_RESULTS_DIR': './test_output/', 'OVERWRITE': True}
2024-08-19 10:29:02,007 - INFO - The output folder already exists
2024-08-19 10:29:02,007 - INFO - Removing data in existing results and tmp directories.
2024-08-19 10:29:02,010 - INFO - Data deleted and folders created.
2024-08-19 10:29:02,010 - INFO - Running preprocessing script for the input. Removing all sequences shorter than 100000.
2024-08-19 10:29:02,217 - INFO - Sanitizing input names
2024-08-19 10:29:02,218 - INFO - Output sequences written to ./input.fasta
Blast command:['python3', './satxplor/blast.py', './sats.fasta', './input.fasta']
Namespace(sat_path='./sats.fasta', genome_path='./input.fasta')

Building a new DB, current time: 08/19/2024 10:29:02
New DB name: /usr/local/SatXplor/input.fasta
New DB title: ./input.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 3000000000B
Adding sequences from FASTA; added 1 sequences in 0.280773 seconds.

removing databse file:././input.fasta.ntf
removing databse file:././input.fasta.not
removing databse file:././input.fasta.njs
removing databse file:././input.fasta.nin
removing databse file:././input.fasta.nto
removing databse file:././input.fasta.nhr
removing databse file:././input.fasta.nsq
removing databse file:././input.fasta.ndb
2024-08-19 10:29:05,588 - INFO - Finding HORs in BLAST output: ./results/blast_output.tsv
2024-08-19 10:29:05,653 - INFO - Monomer statistics:
.
.
.
Until you get these errors:

2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast6_flanks_aligned.fasta
2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast1_flanks_aligned.fasta
2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast5_flanks_aligned.fasta
2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast2_flanks_aligned.fasta
2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast3_flanks_aligned.fasta
2024-08-19 10:29:15,346 - INFO - Running flank alignment ./results/data/sequences/flanks/Cast8_flanks_aligned.fasta
Error executing R script for ./results/data/sequences/flanks/Cast8_flanks_aligned.fasta. Return code: 2
STDERR:

Error executing R script for ./results/data/sequences/flanks/Cast6_flanks_aligned.fasta. Return code: 2
Error executing R script for ./results/data/sequences/flanks/Cast2_flanks_aligned.fasta. Return code: 2
Error executing R script for ./results/data/sequences/flanks/Cast3_flanks_aligned.fasta. Return code: 2
Error executing R script for ./results/data/sequences/flanks/Cast1_flanks_aligned.fasta. Return code: 2
STDERR:
STDERR:

STDERR:
STDERR:

Error executing R script for ./results/data/sequences/flanks/Cast5_flanks_aligned.fasta. Return code: 2
STDERR:

2024-08-19 10:29:15,423 - INFO - Running PCA umap for ./results/data/sequences/Cast2_monomers_aligned.fasta
2024-08-19 10:29:15,423 - INFO - Running PCA umap for ./results/data/sequences/Cast3_monomers_aligned.fasta
2024-08-19 10:29:15,423 - INFO - Running PCA umap for ./results/data/sequences/Cast6_monomers_aligned.fasta
2024-08-19 10:29:15,423 - INFO - Running PCA umap for ./results/data/sequences/Cast5_monomers_aligned.fasta
2024-08-19 10:29:15,423 - INFO - Running PCA umap for ./results/data/sequences/Cast8_monomers_aligned.fasta
2024-08-19 10:29:15,424 - INFO - Running PCA umap for ./results/data/sequences/Cast1_monomers_aligned.fasta
--- Logging error ---
--- Logging error ---
--- Logging error ---
--- Logging error ---
--- Logging error ---
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
--- Logging error ---
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 1160, in emit
    msg = self.format(record)
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 999, in format
    return fmt.format(record)
           ^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 1160, in emit
    msg = self.format(record)
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 1160, in emit
    msg = self.format(record)
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 703, in format
    record.message = record.getMessage()
                     ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 999, in format
    return fmt.format(record)
           ^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 999, in format
    return fmt.format(record)
           ^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 392, in getMessage
    msg = msg % self.args


  File "/root/miniconda3/envs/myenv/lib/python3.12/logging/__init__.py", line 703, in format
    record.message = record.getMessage()
                     ^^^^^^^^^^^^^^^^^^^
.
.
.
It keeps giving errors until it reaches:

2024-08-19 10:29:15,501 - INFO - Running finished at 2024-08-19 10:29:15.501491, deleting the checkpoints file. Copying results file to output directory ./test_output/
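
For readers puzzling over the "--- Logging error ---" blocks: a traceback that ends in `msg = msg % self.args` is what Python's logging module prints when the arguments passed to a logging call do not match the placeholders in the message. Whether that is what happens inside SatXplor is an assumption, since the failing call is not shown in this thread. The sketch below only reproduces the symptom; the logger name and the `stderr_text` value are hypothetical.

```python
import io
import logging

# Log into an in-memory stream so the demo is self-contained.
stream = io.StringIO()
logger = logging.getLogger("logging_error_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

stderr_text = "some R error output"  # placeholder, not real SatXplor output

# Mismatch: the message has no %s placeholder, so "msg % self.args" raises
# TypeError inside the handler. logging reports "--- Logging error ---" on
# sys.stderr and this message never reaches the stream.
logger.info("STDERR:", stderr_text)

# Matching placeholder: the message is formatted and written normally.
logger.info("STDERR: %s", stderr_text)

print(stream.getvalue(), end="")
```

Only the second call reaches the log; the first one is reported on stderr and dropped, which is why such errors do not stop the run.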

I have not been able to find the bug, but it seems to occur when the R scripts are called. Thank you!

JosemaRico commented 4 weeks ago

OK, I just found the bug. The script r_script_runners.py has the same error in several places, wherever a command is built like this:

command = ['Rscript', './eusatxplor/r/pca_umap_plots.R',

The path name is wrong: the eusatxplor folder does not exist, so it must be changed to satxplor, like this:

command = ['Rscript', './satxplor/r/pca_umap_plots.R',
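
A hard-coded relative path like this breaks silently when a folder is renamed. One possible way to make that kind of mistake fail early, shown here only as a sketch and not as the code in r_script_runners.py, is to build the script path from the package directory and check that the file exists before calling Rscript. The helper name `build_r_command` and the `r/` subfolder layout are assumptions for illustration.

```python
from pathlib import Path


def build_r_command(package_dir, script_name, *args):
    """Return an Rscript command list, failing early if the script is missing.

    package_dir is assumed to contain an "r/" subfolder holding the R scripts.
    """
    script = Path(package_dir) / "r" / script_name
    if not script.is_file():
        # A renamed or mistyped folder (e.g. eusatxplor vs satxplor) is
        # reported here instead of surfacing later as a nonzero return code.
        raise FileNotFoundError(f"R script not found: {script}")
    return ["Rscript", str(script), *(str(a) for a in args)]
```

Inside a module that lives in the package folder, `package_dir` could be `Path(__file__).resolve().parent`, so the command no longer depends on the directory name or on where the tests are started from.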

mvolar commented 4 weeks ago

Hello,

Thanks for your patience. We renamed the program recently, and I forgot to push the changes to the subdirectory paths. I have pushed the new changes and will rerun the tests.

You can also use the docker image.

JosemaRico commented 4 weeks ago

Thank you very much, yes, in the end, I decided to use Docker.