Closed kkazi1980 closed 6 months ago
Are there any outputs from the previous steps, namely load_fids and raw_fids_visualisation? What do manifest.csv and the metadata look like? The error most likely indicates that there are no data after loading the raw FIDs.
You can try. I simply replaced the folders in your testthat/data/ folders with different datasets (from MTBLS9131) and even named them in the same way (500 etc.), keeping your metadata. Of course, it does not make any sense in scientific terms, but I wanted to test the workflow on different signals. MTBLS9131.zip
After downloading the raw data directly via FTP using the command:
wget -r -np -nH --cut-dirs=5 ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS9131/
and then making a few small modifications to the metadata.csv, manifest.csv, and params.yml files based on the information provided in UPEC_metadata.xlsx, everything seems to work just fine. It is crucial to ensure that only two groups are kept for comparison in the metadata. If the pipeline is executed with three groups, it will run successfully until the univariate analysis, where an error will appear indicating that there are more than two groups to compare. I am also including all files in a .zip archive here (inputs only, as the full results are still being generated): NASQQ_MTBLS9131.zip
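The two-group requirement described above can be checked before launching the pipeline. The following is only a minimal sketch: the column name `group` and the example rows are hypothetical, since the actual layout of metadata.csv is not shown in this thread.

```python
import csv
import io


def count_groups(metadata_file, group_column):
    """Return the sorted distinct non-empty values found in `group_column`."""
    reader = csv.DictReader(metadata_file)
    if reader.fieldnames is None or group_column not in reader.fieldnames:
        raise KeyError(f"column {group_column!r} not found in metadata")
    values = set()
    for row in reader:
        value = (row[group_column] or "").strip()
        if value:
            values.add(value)
    return sorted(values)


# Made-up rows for illustration only; "group" is an assumed column name.
example = io.StringIO(
    "sample,group\n"
    "1,control\n"
    "2,treated\n"
    "3,other\n"
)
groups = count_groups(example, "group")
if len(groups) != 2:
    print(f"expected exactly 2 groups, found {len(groups)}: {groups}")
```

For a real run, the file would be opened with `open("metadata.csv", newline="")` and the column name adjusted to whatever the metadata actually uses.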
Here are the results from the spectral processing. Aside from three samples that likely needed manual flipping after FT and some adjustment of baseline correction parameters, everything looks fine. The folder containing all results exceeds 600MB, so I am unable to upload it here.
It took 2 hours, but the results have been fully generated. Additionally, treated samples are clearly distinguished from the negative controls, suggesting that everything proceeded as expected (see next screenshot of PCA)
One outlier was detected in the univariate module. This sample exhibited a strong water signal during the spectral processing stage, resulting in poor FT and phasing. The outlier is also evident in the PC1 vs PC2 plot and on the heatmap (sample 30), where distinct clusters for the treatment and control groups are clearly visible as well.
Hi,
I downloaded your files and ran the program. It crashes on Solvent Suppression:
Nextflow 24.04.1 is available - Please consider updating your version to it
N E X T F L O W ~ version 23.10.1
Launching ./main.nf
[gigantic_carson] DSL2 - revision: eee0efc519
.--:--:---.\
{} : {} : __ _ __ __ ____
||__"_|| : | || || \ / || __ \
/ \ `={}_ | N \ A| S Q || Q_| \
| NASQQ | ( ) |_| \_||_| V |_||_| \_|
| v1.0.0 | ( )
| ____ | ( ) Nextflow Automatization and Standarization for Qualitative and Quantitative
| | | | ( ) 1H 1D NMR metabolomics data preparation and analysis
|___|____|_| ( ) =======================================
| | ( ) input from : manifest.csv
/| || |\ ( ) output to : output/outdir_review
| | || | |( ) ------
| |____||____| |( ) run as : nextflow run ./main.nf -c ./nextflow.config -profile standard -params-file params.yml
| _________ |( ) started at : 2024-05-27T09:09:47.060396+02:00
| | | | | |( ) launchdir at : /workspace1/Dropbox/Krzysztof/revisions/GIGAScience/nasqq-main
|__| |_| |_|(___)
executor > local (5)
[ff/701c1d] process > SPECTRAL_PREPROCESSING:LOAD... [100%] 1 of 1 ✔
[42/1372e0] process > SPECTRAL_PREPROCESSING:RAW... [100%] 1 of 1 ✔
[7a/38d866] process > SPECTRAL_PREPROCESSING:GROU... [100%] 1 of 1 ✔
[78/152663] process > SPECTRAL_PREPROCESSING:SOLV... [100%] 2 of 2, failed: 2...
[- ] process > SPECTRAL_PREPROCESSING:APOD... -
[- ] process > SPECTRAL_PREPROCESSING:ZERO... -
[- ] process > SPECTRAL_PREPROCESSING:FOUR... -
[- ] process > SPECTRAL_PREPROCESSING:ZERO... -
[- ] process > SPECTRAL_PREPROCESSING:INTE... -
[- ] process > SPECTRAL_PREPROCESSING:BASE... -
[- ] process > SPECTRAL_PREPROCESSING:NEGA... -
[- ] process > SPECTRAL_PREPROCESSING:WARPING -
[- ] process > SPECTRAL_PREPROCESSING:WIND... -
[- ] process > SPECTRAL_PREPROCESSING:BUCK... -
[- ] process > SPECTRAL_PREPROCESSING:NORM... -
[- ] process > METABOLITES_QUANTIFICATION -
[- ] process > ADD_METADATA -
[- ] process > COMBINE_DATASET_BATCHES -
[- ] process > DATA_ANALYSIS:FEATURES_PROC... -
[- ] process > DATA_ANALYSIS:EXPLORATORY_D... -
[- ] process > DATA_ANALYSIS:UNIVARIATE_AN... -
[- ] process > DATA_ANALYSIS:MULTIVARIATE... -
[- ] process > PATHWAY_ANALYSIS_MULTIVARIATE -
[- ] process > PATHWAY_ANALYSIS_UNIVARIATE -
ERROR ~ Error executing process > 'SPECTRAL_PREPROCESSING:SOLVENT_SUPPRESSION (review)'
Caused by:
Process SPECTRAL_PREPROCESSING:SOLVENT_SUPPRESSION (review)
terminated with an error exit status (1)
Command executed:
solvent_suppresion.R --id "review" --fid_gdc "review_grouped_FIDdata_GDC.rds" --raw_rds "review_selected_fid_list.rds"
Command exit status: 1
Command output: (empty)
Command error:
Error in difsm(y = FidRe, lambda = lambda.ss) :
NA/NaN/Inf in foreign function call (arg 2)
Calls:
Work dir: /workspace1/Dropbox/Krzysztof/revisions/GIGAScience/nasqq-main/output/workdir_review/78/152663b306185410eeb3b2107a352a
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
-- Check '.nextflow.log' file for details
Hi, first, I edited the archive with the uploaded files on Saturday because params.yml had a typo and the metadata was missing the batch column. Do you have the most recent version of it?
Secondly, apart from the errors from Nextflow, please provide:
Hi,
I took the input files from your zip linked in this thread (https://github.com/ardigen/nasqq/files/15443973/NASQQ_MTBLS9131.zip). But the params.yml in the zip file was last modified on Friday at 17:11:38. Is it the right one?
Oh, I edited params.yml on Friday and metadata.csv on Saturday, my bad. Those files seem to be fine. Please provide the following from points 1-3.
Okay, those files seem to be fine if the path /workspace1/Dropbox/Krzysztof/revisions/GIGAScience/nasqq-main is an absolute path. If not, I recommend using either an absolute path or a relative path starting from the directory where all input files are located, e.g. ./metadata.csv, ./MTBLS9131/FILES/RAW_FILES. Now, the most crucial files I need are review_grouped_FIDdata_GDC.rds from the 7a/38d866 workdir and review_selected_fid_list.rds from the ff/701c1d workdir (these are shortened workdir names; the full names are much longer, but Nextflow shows only this prefix during execution).
Anyway, after resolving these issues, adding a "Debugging" section to the README might be a good idea to speed up resolving potential bugs, as the process is not so straightforward, especially for somebody without experience in Nextflow pipelines.
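Since Nextflow prints only the abbreviated task hash, it can help to expand it to the full work directory before looking for the .rds files or `.command.out`. Below is a minimal sketch, not part of the pipeline: `expand_workdir` is a hypothetical helper, and the demo builds a throwaway directory tree using the full hash that appears in the error log above.

```python
import glob
import os
import tempfile


def expand_workdir(work_root, short_hash):
    """Expand an abbreviated task hash such as '78/152663' to matching directories."""
    prefix, rest = short_hash.split("/", 1)
    pattern = os.path.join(glob.escape(work_root), glob.escape(prefix), glob.escape(rest) + "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


# Self-contained demo on a temporary tree that mimics the work dir layout.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "78", "152663b306185410eeb3b2107a352a"))
    matches = expand_workdir(root, "78/152663")
    demo_found = [os.path.relpath(m, root) for m in matches]
print(demo_found)
```

For the run in this thread, the call would presumably be `expand_workdir("output/workdir_review", "7a/38d866")`, assuming the work directory is the one shown in the log.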
These files are in different locations, not in 7a and ff. Here is the full workdir:
Can you please remove the entire workdir, the output, and all .nextflow.log files, and run the pipeline one more time? I see that it crashed on two samples, 4 and 33, and I am trying to figure out why.
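The clean-up requested above could be scripted roughly as follows. This is a sketch under assumptions: `clean_run` is a hypothetical helper, and the default target `output` is assumed only because the log shows both `output/workdir_review` and `output/outdir_review` living under it. It defaults to a dry run so that nothing is deleted by accident.

```python
import glob
import os
import shutil
import tempfile


def clean_run(launch_dir, targets=("output",), dry_run=True):
    """List (and, unless dry_run, delete) run artefacts under launch_dir."""
    doomed = [os.path.join(launch_dir, t) for t in targets]
    # Nextflow rotates its log as .nextflow.log, .nextflow.log.1, ...
    doomed += glob.glob(os.path.join(glob.escape(launch_dir), ".nextflow.log*"))
    doomed = sorted(p for p in doomed if os.path.lexists(p))
    if not dry_run:
        for path in doomed:
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
    return doomed


# Demo on a temporary directory; main.nf must survive the clean-up.
with tempfile.TemporaryDirectory() as launch_dir:
    os.makedirs(os.path.join(launch_dir, "output", "workdir_review"))
    for name in (".nextflow.log", ".nextflow.log.1", "main.nf"):
        open(os.path.join(launch_dir, name), "w").close()
    removed = [os.path.relpath(p, launch_dir) for p in clean_run(launch_dir, dry_run=False)]
    remaining = sorted(os.listdir(launch_dir))
print(removed, remaining)
```

Calling `clean_run(".")` first, without `dry_run=False`, only returns the paths that would be removed, which makes it easy to confirm the list before deleting anything.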
The same, crashes on Solvent suppression
So far, it seems that the FIDs you have in your directory are completely different from mine, whether processed by the pipeline or even manually with the function in R. Below are a few examples:
And this is one from your output:
Can you please start from scratch? Here are the steps:
That's super strange, I did it one more time as I described above and it's running correctly...
It seems to be OK now, but it crashed on warping because I did not adjust the number of CPUs. I will let you know soon.
OK, completed successfully, thanks!
I am getting the Group Delay Correction error when trying to process the dataset MTBLS9131 (MetaboLights):
Command error:
Error in Fid_info[1, "DECIM"] : subscript out of bounds
Calls:
Execution halted
I have DECIM defined as 24 in all data folders.
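One way to double-check the DECIM entries independently of the pipeline is to scan the Bruker `acqus` files directly. The sketch below assumes the usual JCAMP-style line `##$DECIM= 24`; `read_decim` and `scan_experiments` are hypothetical helpers, the demo files are made up, and a clean result here does not by itself explain the subscript error.

```python
import os
import re
import tempfile

DECIM_LINE = re.compile(r"^##\$DECIM=\s*(\S+)", re.MULTILINE)


def read_decim(acqus_path):
    """Return the DECIM value from an acqus file as a string, or None if absent."""
    with open(acqus_path, "r", errors="replace") as handle:
        match = DECIM_LINE.search(handle.read())
    return match.group(1) if match else None


def scan_experiments(raw_root):
    """Map each directory that contains an acqus file to its DECIM value."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(raw_root):
        if "acqus" in filenames:
            rel = os.path.relpath(dirpath, raw_root)
            found[rel] = read_decim(os.path.join(dirpath, "acqus"))
    return found


# Demo with two made-up experiment folders, one of them lacking DECIM.
with tempfile.TemporaryDirectory() as raw_root:
    for name, body in (("10", "##TITLE= demo\n##$DECIM= 24\n##$DSPFVS= 20\n"),
                       ("20", "##TITLE= demo\n##$DSPFVS= 20\n")):
        os.makedirs(os.path.join(raw_root, name))
        with open(os.path.join(raw_root, name, "acqus"), "w") as handle:
            handle.write(body)
    decim_by_dir = scan_experiments(raw_root)
print(decim_by_dir)
```

Any directory mapped to `None` has an `acqus` file without a DECIM line; directories that do not appear at all have no `acqus` file, which would point to a path or folder-layout problem in the manifest rather than to the parameter itself.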