Closed Upendra19993 closed 7 months ago
Hi @Upendra19993, Unfortunately, it seems you were missing a simple R package for coloring the resulting plots. When you installed SQANTI3, did you also create the conda environment "SQANTI3.env"? Creating the environment installs every dependency at the version the package needs (you can check how to do this in the SQANTI3 documentation: https://github.com/ConesaLab/SQANTI3/wiki/Dependencies-and-installation#2-creating-the-conda-environment )
It seems you are running SQANTI3 from your "base" environment. You should either create the SQANTI3 environment and then run "conda activate SQANTI3.env", or install the RColorConesa package (https://cran.r-project.org/web/packages/RColorConesa/index.html ) in your base environment.
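A quick way to catch this before launching the filter is to check which conda environment is active. The sketch below is only an illustration: the helper names `in_expected_env` and `env_hint` are hypothetical, and the environment name passed in is an assumption (the shell prompt later in this thread shows "SQANTI3.env"; use whatever name you created). It relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation.

```shell
#!/usr/bin/env bash
# Sketch: warn when a command is about to run outside the expected
# conda environment. Function names are illustrative, not part of SQANTI3.

# Returns 0 if the active conda environment matches the expected name.
in_expected_env() {
    local expected="$1"
    # conda exports CONDA_DEFAULT_ENV with the name of the active environment
    [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]
}

# Prints a one-line hint describing what to do next.
env_hint() {
    local expected="$1"
    if in_expected_env "$expected"; then
        echo "ok: running inside $expected"
    else
        echo "warning: active environment is '${CONDA_DEFAULT_ENV:-none}'; run: conda activate $expected"
    fi
}

# Example call (environment name is an assumption):
# env_hint SQANTI3.env
```

If the hint reports "base", activate the SQANTI3 environment first and rerun the filtering step.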
Hi carolinamonzo,
No, I didn't create the conda environment "SQANTI3.env" when I installed SQANTI3.
I have now installed SQANTI3 in the conda environment "SQANTI3.env" and rerun the filtering step. The previous error about the missing RColorConesa package is gone, but I now get warnings about dplyr and ggplot2, and an error about accessing CRAN to install or load packages. The messages are below.
Warning message:
There were 2 warnings in `dplyr::mutate()`.
The first warning was:
ℹ In argument: `structural_category = %>%(...)`.
Caused by warning:
! Unknown levels in `f`: genic_intron
ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
Loading required package: ggplot2
Warning message:
package ‘ggplot2’ was built under R version 4.3.2
Error in contrib.url(repos, type) :
trying to use CRAN without setting a mirror
Calls: suppressMessages ... withCallingHandlers -> install.packages -> startsWith -> contrib.url
Execution halted
(SQANTI3.env) [uqwwijes@bunya3 SQANTI3-5.2]$
I get all the output files, but I am not sure whether they are accurate given the warning messages. Could you please have a look and suggest how to resolve this issue?
Many thanks, Upendra.
Hi @Upendra19993, the warnings are not worrisome. I'll update the SQANTI installation steps so the warning doesn't appear. In your case, it installed from the cloud since the source wasn't specified. You can go ahead and continue with your analysis; the warnings you found have not affected your data.
Best, Carolina.
Many thanks, carolinamonzo!
I ran into the same problem. It seems it has not been fixed in the newest SQANTI3-5.2?
@CaiCheng1996 Try this:
file.edit(".Rprofile")
options(repos = c(CRAN = "https://cloud.r-project.org"))
That did the trick, amazing.
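The same fix can be applied from the shell instead of opening the file in an editor. The sketch below appends the mirror setting to an .Rprofile only when the file does not already set a CRAN repository. The function name `set_cran_mirror` is hypothetical, and the default mirror URL is simply the one suggested above. R reads .Rprofile from the current working directory or, if there is none, from the home directory, so the path you pass decides where the setting applies.

```shell
#!/usr/bin/env bash
# Sketch: add a CRAN mirror to an .Rprofile without duplicating the setting.
# The function name is illustrative; the mirror is the one quoted in this thread.

set_cran_mirror() {
    local rprofile="$1"
    local mirror="${2:-https://cloud.r-project.org}"

    # Leave the file alone if it already sets a CRAN repository.
    if [ -f "$rprofile" ] && grep -q 'CRAN *=' "$rprofile"; then
        echo "unchanged"
        return 0
    fi

    # If the file exists and does not end in a newline, add one first
    # so the new setting starts on its own line.
    if [ -s "$rprofile" ] && [ -n "$(tail -c 1 "$rprofile")" ]; then
        echo >> "$rprofile"
    fi

    printf 'options(repos = c(CRAN = "%s"))\n' "$mirror" >> "$rprofile"
    echo "added"
}

# Example call (path is an assumption; adjust to your setup):
# set_cran_mirror "$HOME/.Rprofile"
```

Running the function a second time prints "unchanged", so it is safe to keep in a setup script.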
Hi all,
I am running SQANTI3. I started with the example dataset you provide. When running the filtering step, I get an error and warning messages. Could you please have a look and help me resolve this issue? I have copied the complete output for your reference; the error messages are at the end.
(base) [uqwwijes@bun101 SQANTI3_output_original_names_after_reinstallation2]$ sqanti3_filter.py ml UHR_chr22_classification.txt
Rscript (R) version 4.3.1 (2023-06-16)
Output directory not defined. All the outputs will be stored at /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2 directory
Output name not defined. All the outputs will have the prefix UHR_chr22
Write arguments to /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2/UHR_chr22_params.txt...
Running SQANTI3 filtering...
/sw/local/rocky8/noarch/qcif/software/miniconda3/envs/sqanti3_5.2/bin/Rscript /sw/local/rocky8/noarch/qcif/software/SQANTI3-5.2/utilities/filter/SQANTI3_MLfilter.R -c /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2/UHR_chr22_classification.txt -o UHR_chr22 -d /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2 -t 0.8 -j 0.7 -i 60 -f False -e False -m False -z 3000
CURRENT ML FILTER PARAMETERS:
[1] "sqanti_classif: /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2/UHR_chr22_classification.txt"
[2] "output: UHR_chr22"
[3] "dir: /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output_original_names_after_reinstallation2"
[4] "percent_training: 0.8"
[5] "threshold: 0.7"
[6] "intrapriming: 60"
[7] "force_fsm_in: FALSE"
[8] "force_multi_exon: FALSE"
[9] "intermediate_files: FALSE"
[10] "max_class_size: 3000"
[11] "help: FALSE"
Reading SQANTI3 *_classification.txt file...
Checking data for mono and multi-exon transcripts...
Checking input data for True Positive (TP) and True Negative (TN) sets...
Using Novel Not In Catalog non-canonical isoforms as True Negatives for training.
Not enough (< 250) Reference Match transcript isoforms among FSM, all FSM transcripts will be used as Positive set.
Balancing number of isoforms in TP and TN sets...
Wrote generated TP and TN lists to files:
Aggregating FL counts across samples (if more than one sample is provided)...
Replacing NAs with appropriate values for ML...
Handling factor columns...
Handling integer columns...
Removing variables with near-zero variance...
Removed columns:
[1] "chrom" "RTS_stage" "n_indels"
[4] "n_indels_junc" "dist_to_CAGE_peak" "within_CAGE_peak"
[7] "dist_to_polyA_site" "within_polyA_site" "polyA_dist"
Removing highly correlated features... (correlation threshold = 0.9).
All correlations <= 0.9
Creating positive and negative sets for classifier training and testing...
Finished creating training data set.
Partitioning data into training and test sets...
Description of the training set:
   full-splice_match novel_not_in_catalog
                 231                  231

   full-splice_match novel_not_in_catalog
                  57                   57
Training Random Forest Classifier...
Pre-defined Random Forest parameters (supplied to caret::trainControl()):
Loading required package: ggplot2
Loading required package: lattice
Random forest training finished.
Saved generated classifier to randomforest.RData file.
Random forest evaluation: applying classifier to test set...
Test set evaluation results:
AUC, Sensitivity and Specificity on test set:
      ROC      Sens      Spec
0.9713758 0.7719298 0.9824561
Writing summary to testSet_summary.txt file.
Confusion matrix:
          Reference
Prediction POS NEG
       POS  44   1
       NEG  13  56
Writing confusion matrix and statistics to output files: testSet_confusionMatrix.txt testSet_stats.txt
Global variable importance in Random Forest classifier:
                         Overall
min_cov               35.7541546
min_sample_cov        35.5178859
bite                  27.7809271
gene_exp              18.3663935
sd_cov                17.7682316
predicted_NMD         14.7867100
iso_exp               10.6466531
diff_to_gene_TSS       9.1437091
ratio_TSS              8.2086366
FSM_class              7.5753832
length                 6.9421819
diff_to_gene_TTS       5.9650142
exons                  5.8164596
perc_A_downstream_TTS  5.2488334
ratio_exp              0.6994311
coding                 0.5164620
Variable importance table saved as classifier_variable-importance_table.txt
Calculating and printing test set ROC curves...
Setting levels: control = 1, case = 2
Setting direction: controls > cases
Setting levels: control = 1, case = 2
Setting direction: controls < cases
ROC curves saved to testSet_ROC_curve.pdf file. Includes:
Applying Random Forest classifier to input dataset...
Random forest prediction finished successfully!
Random forest classification results:
Negative Positive
    1648     1690
Warning message:
package ‘ggplot2’ was built under R version 4.3.2
Applying intra-priming filter to our dataset.
Intra-priming filtered transcripts:
FALSE  TRUE
 3213   712
Writing filter results to classification file...
SUMMARY OF MACHINE LEARNING + INTRA-PRIMING FILTERS:
Artifact  Isoform
    2177     1748
SQANTI3 ML filter finished successfully!
Loading required package: magrittr
Reading ML result classification table...
Reading classifier variable importance table...
Rows: 16 Columns: 2
── Column specification ────────────────────────
Delimiter: "\t"
chr (1): variable
dbl (1): importance
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Reading ML filter parameters...
Rows: 53 Columns: 2
── Column specification ────────────────────────
Delimiter: "\t"
chr (2): parameter, value
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Reading ML performance statistics...
Rows: 18 Columns: 2
── Column specification ────────────────────────
Delimiter: "\t"
chr (1): metric
dbl (1): value
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 4 Columns: 3
── Column specification ────────────────────────
Delimiter: "\t"
chr (2): Prediction, Reference
dbl (1): Freq
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Warning message:
There were 2 warnings in `dplyr::mutate()`.
The first warning was:
ℹ In argument: `structural_category = %>%(...)`.
Caused by warning:
! Unknown levels in `f`: genic_intron
ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
Loading required package: ggplot2
Warning message:
package ‘ggplot2’ was built under R version 4.3.2
Warning in install.packages("RColorConesa") :
'lib = "/sw/local/rocky8.6/noarch/qcif/software/miniconda3/envs/sqanti3_5.2/lib/R/library"' is not writable
Error in install.packages("RColorConesa") : unable to install packages
Calls: suppressMessages -> withCallingHandlers -> install.packages
Execution halted
(base) [uqwwijes@bun101 SQANTI3_output_original_names_after_reinstallation2]$
Many thanks, Upendra.