MaestSi / MetONTIIME

A Meta-barcoding pipeline for analysing ONT data in QIIME2 framework
GNU General Public License v3.0

Failure to Launch... #80

Closed annegilewski closed 11 months ago

annegilewski commented 11 months ago

Good evening!

I was wondering if I could trouble you to look over my .nf and .config code. I'm getting an error right at the start when executing the program--at the importdB process. My files are already concatenated, so I had set that process to 'false'.

I realize I'm still getting a Docker file link issue and have mounted the directory to Docker, but I'm still having some issues. I had to attach these as .odt files, as .rtf isn't supported (I'm on a Mac).

MOTconfig.odt Error.odt MOTnf.odt

Thank you so much, Anne

MaestSi commented 11 months ago

Hi Anne, I see multiple errors in the config file. First, if you are planning to read/write in multiple directories (e.g. /gilewski and /Users/stevegilewski), you should mount all of them. So, instead of:

containerOptions = '-v /users/stevegilewski/metONTIIME:/users/stevegilewski/metONTIIME'

you should write:

containerOptions = '-v /Users/stevegilewski/:/Users/stevegilewski/ -v /gilewski:/gilewski'

Moreover, keep in mind that the file system is case sensitive, so you cannot write /users instead of /Users (it's not the same path), and the / cannot be omitted either. I see that in the config file you sometimes use them as if they were interchangeable, but they are not. Let me know if you are able to run it after fixing these errors. Best, SM
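For reference, a minimal sketch of what the docker scope of the Nextflow config could look like with both mounts from this thread (paths are the ones discussed above; the surrounding scope layout is an assumption, not the exact MetONTIIME config):

```groovy
// Hypothetical sketch only; adapt to the actual metontiime config structure
docker {
    enabled = true
    // Mount every directory the pipeline reads from or writes to
    containerOptions = '-v /Users/stevegilewski/:/Users/stevegilewski/ -v /gilewski:/gilewski'
}
```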

annegilewski commented 11 months ago

Hi again,

I fixed the errors that you noted (hopefully got all of them), but I'm still exiting at the concatFastq process. I had adjusted the .config to 'false' for that process--is there another place I need to change it? Docker is still giving me issues with routing the directory--I am pointing to both local and external drives, but it doesn't appear to save when I apply the change and restart File Sharing. Can this code be run without Docker, or is that not possible?

Error12.8.odt Updatedconfig.odt

MaestSi commented 11 months ago

Hi, does --workDir contain fastq.gz files (one for each sample)? If yes, it is ok to set the concatenateFastq process to false. Apparently there are issues with the mounting of the /gilewski directory. Can you move the files to /Users/stevegilewski, change --workDir and remove the -v /gilewski:/gilewski mounting option from the config file? This code can run with either Docker or Singularity, no other options. Are you sure the Docker installation is ok? You should try running a simple hello-world container to check that it can run without sudo. SM
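The hello-world check mentioned above can be run from a terminal like this (a sketch; it assumes Docker Desktop is installed, and reuses the mount path from this thread):

```shell
# Smoke test: verify Docker can pull and run a container without sudo
docker run --rm hello-world

# Optional: confirm a bind mount works the same way the pipeline will use it
docker run --rm -v /Users/stevegilewski/:/Users/stevegilewski/ ubuntu ls /Users/stevegilewski/
```

If either command fails with a permission or file-sharing error, the problem is in the Docker setup rather than in the pipeline config.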

annegilewski commented 11 months ago

I reinstalled Docker to be safe--that seems to be working fine. I have a bug post in with them regarding mounting the directories.

Yes, the working directory contains concatenated files for each pool (24 barcodes in each pool, seven pools total). I was trying to run analyses on one pool at a time. I can try moving the working directory--my impetus for keeping it on an external drive is that the files are quite large and my laptop doesn't have that much storage. I can try on our desktop Mac.

I'll see about running a hello-world this weekend. I did the Docker tutorials and it worked fine, but I'll try again with the reinstalled program.

Thanks again!

annegilewski commented 11 months ago

Progress! I was able to get to importfastq before it errored out. I have both directories on the external drive with Docker mounted correctly. I'm getting a duplicate file name error at the import stage and I'm not exactly sure how to correct. The file names were generated at sequencing--I thought they would be unique enough with the barcode number.

Any thoughts? Anne

importFastQ.odt

MaestSi commented 11 months ago

Hi, it seems /Volumes/gilewski/ConcatFiles/Pool2 does not contain only one fastq.gz file for each sample. Please fix this, then remove the /Volumes/gilewski/MetONTIIME/manifest.txt and /Volumes/gilewski/MetONTIIME/sample-metadata.tsv files and restart the pipeline. SM

annegilewski commented 11 months ago

Hi--that is really odd. I have 24 concatenated fastq.gz barcodes with no dupes in that pool. I'm removing the files and trying again.

Thank you!

Screenshot 2023-12-12 at 3 25 23 PM
MaestSi commented 11 months ago

The manifest.txt and sample-metadata.tsv files are not regenerated if they are already found. So, my guess is that at some point you specified as workDir a directory containing multiple pools, with duplicated file names, and that's why you are getting the error. If this is the case, removing those two files should fix the issue. SM
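A quick way to check for the duplicated file names described above is to list any fastq.gz basenames that occur more than once under a directory. This sketch demonstrates the check on a throwaway directory; point the find command at the real pool directory (e.g. the one used as workDir) instead:

```shell
# Demo setup: a fake directory tree with the same basename in two pools
POOL_DIR=$(mktemp -d)
mkdir -p "$POOL_DIR/pool1" "$POOL_DIR/pool2"
touch "$POOL_DIR/pool1/barcode03.fastq.gz" "$POOL_DIR/pool2/barcode03.fastq.gz"

# List basenames that appear more than once anywhere under POOL_DIR
find "$POOL_DIR" -type f -name '*.fastq.gz' -exec basename {} \; | sort | uniq -d
# prints: barcode03.fastq.gz
```

An empty output means every sample file name is unique, which is what the manifest generation expects.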

annegilewski commented 11 months ago

That didn't work--still getting the same error. In the line below, if I don't have the metadata.tsv file created yet, should I have the path as below, or just the directory where I want the file to go, i.e. stop the path at ...MetONTIIME/?

//Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv"

annegilewski commented 11 months ago

Next try: I did another run with just five barcodes from one pool with the above line as is. These barcodes passed the first few processes.

It errored out again at derepSeq--are there other paths I need to have in the .config code?

Error 12.12.odt

MaestSi commented 11 months ago

Hi, it seems there are issues with the metadata file. sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv" is the correct way of specifying it, but keep in mind that you should specify a different file for each pool; otherwise the wrong one is used, and the samples are not going to be found. The same goes for the manifest.txt file, but that is ok for this pool, as the pipeline was able to reach this point. This behaviour with the sample metadata file is intended, as one may want to add some metadata to be used for subsequent analyses; if the file does not exist yet, it is created with minimal information. SM
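As a sketch of the per-pool setup described above, each run could point at its own pool directory and metadata file (the pool2-* file name here is hypothetical, not something the pipeline requires):

```groovy
//Path to working directory including fastq.gz files
workDir="/Volumes/gilewski/ConcatFiles/Pool2"
//Path to sample metadata tsv file; one per pool, so runs don't overwrite each other
sampleMetadata="/Volumes/gilewski/MetONTIIME/pool2-sample-metadata.tsv"
```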

annegilewski commented 11 months ago

I think I figured out at least one error with my test data--the metadata file isn't formatted properly. I had erroneously thought the process would create a file with headers etc. Is there a format I need to follow when setting up the .tsv? I did look at Keemei, but it was unclear whether there is a standard for this type of data.

There was an issue with loading the file /Volumes/gilewski/MetONTIIME/final_metadata.tsv as metadata:

Failed to locate header. The metadata file may be empty, or consists only of comments or empty rows.

There may be more errors present in the metadata file. To get a full report, sample/feature metadata files can be validated with Keemei: https://keemei.qiime2.org
MaestSi commented 11 months ago

Hi, if it doesn’t exist, the one created automatically will be compliant with QIIME2 requirements. Otherwise, if you want to create it yourself, have a look at the QIIME2 documentation or validate it with Keemei. SM
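For orientation, a minimal hand-written QIIME2 metadata file is a tab-separated table whose first column header is an identifier such as sample-id, with one row per sample. The IDs and the extra column below are hypothetical examples, not values from this dataset:

```
sample-id	description
barcode01	pool2
barcode02	pool2
barcode03	pool2
```

The "Failed to locate header" error quoted above is what QIIME2 reports when this header row is missing or the file starts with comments/empty rows.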

annegilewski commented 11 months ago

Okay. So if the file doesn't exist (i.e. I didn't make one), is this how the line should look?

//Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="/Volumes/gilewski/MetONTIIME/final_metadata.tsv"

Or do I stop the path at .../MetONTIIME

annegilewski commented 11 months ago

Me again. Sorry for all the posts. I am erroring out at the derepSeq step. The files are created in the appropriate folder, but I don't know where the error is. I adjusted the CPU requirements, but perhaps that is the problem? Error code attached, with a screen cap of the files. I have a .tsv file created with sample-id and absolute-filepath headers to start.

Screenshot 2023-12-18 at 5 12 33 PM

error 12.18.odt

MaestSi commented 11 months ago

Hi, sorry for the late reply, I have been (and still am) ill. I didn't see any specific error in the file you sent me; it just reports the exit code to be 1. From the web: "Exit Code 1 indicates that a container shut down, either because of an application failure or because the image pointed to an invalid file". I would suggest restarting the pipeline, but I would also suggest running the analysis on the toy dataset available in the repository first. Best, SM

annegilewski commented 11 months ago

Very sorry to hear about your illness. Thank you for taking time to post. Right now, I am working with my own set of three barcodes from one pool. I have completed the pipeline to diversity analysis--

All metadata filtered after dropping columns that contained non-categorical data.

Which, after some research, leads me to believe my metadata file is not set up correctly. I would love for it to be created by the pipeline, but I am unclear how the param line should be scripted and whether other parts of the script need to be adjusted downstream.

I'm trying this now:

//Path to working directory including fastq.gz files
workDir="/Volumes/gilewski/ConcatFiles/Test"
//Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="."

MaestSi commented 11 months ago

Hi, you should specify the full path to a file which, if it does not exist yet, will be created, e.g.: sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv" SM

annegilewski commented 11 months ago

Got it. I was able to run the pipeline last night to the end, with all visualizations and tables. The only change I made today was to the taxa level parameter--I used the same test data and didn't change anything else.

Docker is erroring out here: Command error: There was a problem importing /Volumes/gilewski/resultsDir/importFastq/manifest.txt:

/Volumes/gilewski/resultsDir/importFastq/manifest.txt is not a(n) SingleEndFastqManifestPhred33V2 file:

There was an issue with loading the metadata file:

Metadata IDs must be unique. The following IDs are duplicated: 'FAX22978_pass_barcode03_59c1dc36_e3ae5695'

The metadata.tsv is created, but it won't transfer data from the manifest.txt. I am using exactly three concatenated files with no duplicates. I clear out the output files with each run, so I'm starting fresh. I freed up more room on my computer and upped the CPU allowance in Docker to 4, since I was running over 3 cores. I have restarted Docker and ensured all updates were done. My computer is also up to date.

I tried -resume to see if I can get the pipeline going again, but it still errors here today. Any thoughts? Again, no changes to the code from last night's successful run.

manifest.txt

MaestSi commented 11 months ago

The error is quite self-explanatory, and indeed it seems to me (opening the file from my phone) that there are duplicates in the manifest.txt file, aren't there? SM

annegilewski commented 11 months ago

Yes, but if I have the process create the metadata.tsv and my test data has no duplicates, I'm unclear why this happened. Is it as simple as deleting the duplicates from the manifest.txt and resuming the process? Anne


MaestSi commented 11 months ago

Yes, I have no idea why it contains duplicated lines, but you can manually remove them and start the pipeline from scratch, without the -resume parameter. If the manifest.txt file is found, it won't be edited by the pipeline. SM
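The manual removal can also be done with a one-liner that keeps the header and the first occurrence of each sample ID (first column). This sketch runs on a throwaway manifest with hypothetical IDs; substitute the real manifest.txt path:

```shell
# Build a demo manifest with a duplicated sample-id line
MANIFEST=$(mktemp)
printf 'sample-id\tabsolute-filepath\n' > "$MANIFEST"
printf 'bc03\t/data/bc03.fastq.gz\n' >> "$MANIFEST"
printf 'bc03\t/data/bc03.fastq.gz\n' >> "$MANIFEST"

# Keep only the first line seen for each value in column 1
awk '!seen[$1]++' "$MANIFEST" > "${MANIFEST}.dedup"
cat "${MANIFEST}.dedup"
# prints the header plus a single bc03 line
```

After checking the .dedup file looks right, move it over the original manifest.txt and restart without -resume.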

annegilewski commented 11 months ago

Great. I’ll give that a go. The successful run worked beautifully and we were really excited about it ☺️ Happy Holidays! Anne


MaestSi commented 11 months ago

Happy holidays to you as well! I am going to close the issue, feel free to reopen it in case you have any further questions! Best wishes, SM