ConesaLab / SQANTI3

Tool for the Quality Control of Long-Read Defined Transcriptomes
GNU General Public License v3.0

[BUG] Incomplete SQANTI3_filter output #335

Closed zhangkn3 closed 1 month ago

zhangkn3 commented 1 month ago

Is there an existing issue for this?

Have you loaded the SQANTI3.env conda environment?

Problem description

What you tried to achieve: I am trying to run the SQANTI3 workflow to classify the novel transcripts identified by Bambu from long-read RNA sequencing data.

How you went about it (referring to the code sample): The QC step succeeded, and I am now trying to filter the identified transcripts.

Why the current behaviour is a problem and what output you expected instead: I got incomplete results and cannot run SQANTI3_rescue as the next step.

Code sample

#!/bin/bash

echo "Starting sqanti3 filter"
source activate /shared/data/conda_env/SQANTI3.env
export TMPDIR='/shared/data/knz_tmp'

Test/SQANTI3-5.2.2/sqanti3_filter.py rules /shared/data/bmb_espresso/7_SQANTI3/test_classification.txt \
  -o rules_filter \
  -d /shared/data/bmb_espresso/7_SQANTI3/rules_filter

Error

Starting sqanti3 filter
/cm/local/apps/slurm/var/spool/job07972/slurm_script: line 6: activate: No such file or directory
/shared/home/kz116/Test/SQANTI3-5.2.2/sqanti3_filter.py:33: DeprecationWarning: Use shutil.which instead of find_executable
  RSCRIPTPATH = distutils.spawn.find_executable('Rscript')
Rscript (R) version 4.3.3 (2024-02-29)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

 Reading classification file


 Reading JSON file with rules to filter


 Performing filtering


 Writting results


 SQANTI3 Rules filter report

Loading required package: magrittr

Reading Rules result classification table...
Warning message:
There were 2 warnings in dplyr::mutate().
The first warning was:
ℹ In argument: structural_category =%>%(...).
Caused by warning:
! Unknown levels in f: incomplete-splice_match
ℹ Run dplyr::last_dplyr_warnings() to see the 1 remaining warning.
Loading required package: ggplot2
trying URL 'https://cran.rstudio.com/src/contrib/RColorConesa_1.0.0.tar.gz'
Content type 'application/x-gzip' length 6576 bytes

downloaded 6576 bytes

The downloaded source packages are in ‘/shared/data/knz_tmp/RtmpE4AswB/downloaded_packages’

Generating common filter plots...

summarise() has grouped output by 'structural_category'. You can override using the .groups argument.
summarise() has grouped output by 'filter'. You can override using the .groups argument.
summarise() has grouped output by 'filter'. You can override using the .groups argument.
Warning message:
There was 1 warning in dplyr::mutate().
ℹ In argument: filter = factor(filter) %>% forcats::fct_relevel(c("Before", "After")).
Caused by warning:
! 2 unknown levels in f: Before and After
summarise() has grouped output by 'filter'. You can override using the .groups argument.
Error in dplyr::mutate():
ℹ In argument: ism_type =%>%(...).
Caused by error in forcats::fct_relevel():
! .f must be a factor or character vector, not an empty logical vector.
Backtrace:
     ▆
  1. ├─... %>% ...
  2. ├─dplyr::mutate(...)
  3. ├─dplyr:::mutate.data.frame(...)
  4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
  5. │   ├─base::withCallingHandlers(...)
  6. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
  7. │     └─mask$eval_all_mutate(quo)
  8. │       └─dplyr (local) eval()
  9. ├─... %>% forcats::fct_relevel(c("Unique", "Multiple"))
 10. └─forcats::fct_relevel(., c("Unique", "Multiple"))
 11.   └─forcats:::check_factor(.f)
 12.     └─cli::cli_abort(...)
 13.       └─rlang::abort(...)
Execution halted
Write arguments to /shared/data/bmb_espresso/7_SQANTI3/rules_filter/rules_filter_params.txt...

Running SQANTI3 filtering...

/shared/data/conda_env/SQANTI3.env/bin/Rscript /shared/home/kz116/Test/SQANTI3-5.2.2/utilities/filter/SQANTI3_rules_filter.R -c /shared/data/projects/lrRNA_neoAg/bmb_espresso/7_SQANTI3/test_classification.txt -o rules_filter -d /shared/data/bmb_espresso/7_SQANTI3/rules_filter -j /shared/home/kz116/Test/SQANTI3-5.2.2/utilities/filter/filter_default.json -u /shared/home/kz116/Test/SQANTI3-5.2.2/utilities -e False

Anything else?

No response

carolinamonzo commented 1 month ago

Hi @zhangkn3,

First of all, thank you for using SQANTI3 :)

Looks like the error is when it's trying to read a file. Since you are running SQANTI3 from the directory where you have your files, could you try running the full command like I'm writing below? And also check that "/shared/data/bmb_espresso/7_SQANTI3/test_classification.txt" exists and has information inside.

#!/bin/bash
echo "Starting sqanti3 filter"
source activate /shared/data/conda_env/SQANTI3.env
export TMPDIR='/shared/data/knz_tmp'

python3 Test/SQANTI3-5.2.2/sqanti3_filter.py rules /shared/data/bmb_espresso/7_SQANTI3/test_classification.txt -o rules_filter -d /shared/data/bmb_espresso/7_SQANTI3/rules_filter --json_filter Test/SQANTI3-5.2.2/utilities/filter/filter_default.json

Best, Carol.

zhangkn3 commented 1 month ago

Hi Carolina,

Happy to hear from you!

I reran the code you provided, as follows: [image]

I received the same errors: [image]

The environment is the same as SQANTI3.env; I only changed the environment name in the .yml file.

zhangkn3 commented 1 month ago

Here are the incomplete output files I received: [image]

alexpan00 commented 1 month ago

Hi @zhangkn3 and thank you for using SQANTI3,

The issue is that Bambu does not report any ISM (incomplete-splice_match) transcripts, at least in my experience, and your error suggests that is the case here too: the report code fails on ism_type because that column is empty.
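One quick way to confirm that the classification file contains no ISM rows is to count transcripts per structural category. This is only a minimal sketch: it assumes the classification file is tab-separated with a header line containing a `structural_category` column (the column name appears in the log above), and `count_categories` is a hypothetical helper name, not part of SQANTI3.

```shell
#!/bin/sh
# Count rows per structural_category in a tab-separated classification file.
# count_categories is a hypothetical helper, not a SQANTI3 command.
count_categories() {
    awk -F'\t' '
        # Header line: locate the structural_category column by name.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "structural_category") col = i; next }
        # Data lines: tally the category, but only if the column was found.
        col     { counts[$col]++ }
        END     { for (k in counts) print k "\t" counts[k] }
    ' "$1"
}

# Path taken from the original script; skipped silently if the file is absent.
CLASSIFICATION="${1:-/shared/data/bmb_espresso/7_SQANTI3/test_classification.txt}"
if [ -f "$CLASSIFICATION" ]; then
    count_categories "$CLASSIFICATION"
fi
```

If `incomplete-splice_match` is missing from the output, the file has no ISM transcripts, which would be consistent with the `Unknown levels in f: incomplete-splice_match` warning.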

The error you are getting is in the report of the filter step, so the outputs that you get should be fine. To double check if that is the case, you can run SQANTI3 filter with option --skip_report.

If you want some extra outputs from the filter, like the filtered gtf, you need to provide that as an input with flag --gtf.
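A rough sketch of such a rerun, reusing the paths from the original script. The GTF path is a placeholder (the thread does not give it), and the `DRY_RUN` switch is just an illustration device so the command can be inspected before it is executed; neither is part of SQANTI3.

```shell
#!/bin/sh
# Paths below are copied from the original script in this issue.
SQANTI3_DIR="Test/SQANTI3-5.2.2"
CLASSIFICATION="/shared/data/bmb_espresso/7_SQANTI3/test_classification.txt"
OUT_DIR="/shared/data/bmb_espresso/7_SQANTI3/rules_filter"
# Placeholder: replace with the GTF that was given to the QC step.
GTF="/path/to/transcripts.gtf"

# --skip_report avoids the failing report step; --gtf requests the filtered GTF.
FILTER_CMD="python3 $SQANTI3_DIR/sqanti3_filter.py rules $CLASSIFICATION -o rules_filter -d $OUT_DIR --gtf $GTF --skip_report"

# Print the command; run it only when DRY_RUN=0 is set explicitly.
echo "$FILTER_CMD"
if [ "${DRY_RUN:-1}" = "0" ]; then
    # Unquoted on purpose so the string splits into arguments
    # (this assumes none of the paths contain spaces).
    $FILTER_CMD
fi
```

Running it as-is only prints the command; `DRY_RUN=0 sh rerun_filter.sh` (script name hypothetical) would execute it.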

Alejandro.

zhangkn3 commented 1 month ago

Hi Alejandro, Hi Carol,

Thanks for your detailed explanation.

I reran the code with the --gtf and --skip_report flags and got the processed results without the report. That is fine; I can analyze them myself.

Thanks again for all your effort in the excellent pipeline!

Best, Kenan