Closed: annegilewski closed this issue 11 months ago
Hi Anne,
I see multiple errors in the config file. First, if you are planning to read/write in multiple directories (e.g. /gilewski and /Users/stevegilewski), you should mount all of them.
So, instead of:
containerOptions = '-v /users/stevegilewski/metONTIIME:/users/stevegilewski/metONTIIME'
you should write:
containerOptions = '-v /Users/stevegilewski/:/Users/stevegilewski/ -v /gilewski:/gilewski'
Moreover, keep in mind that the file system is case sensitive, so you cannot write /users instead of /Users (it's not the same path), and the leading / cannot be omitted. I see that in the config file you sometimes treat them as interchangeable, but they are not.
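As a reference, a minimal sketch of how the mounts could look in the config (exactly where the process scope sits depends on the MetONTIIME config layout, so treat this as illustrative):
docker {
    enabled = true
}
process {
    // both locations mounted, with matching case on the host and container side
    containerOptions = '-v /Users/stevegilewski/:/Users/stevegilewski/ -v /gilewski:/gilewski'
}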
Let me know if you are able to run it after fixing these errors.
Best,
SM
Hi again,
I fixed the errors that you noted (hopefully I got all of them), but I'm still exiting at the concatFastq process. I had adjusted the .config to 'false' for that process; is there another place I need to change it? Docker is still giving me issues with routing the directory: I am pointing to both the local and external drives, but the change appears not to save when I apply it and restart in File Sharing. Can this code be run without Docker, or is that not possible?
Hi, does --workDir contain fastq.gz files (one for each sample)? If yes, it is ok to set the concatenateFastq process to false. Apparently there are issues with the mounting of the /gilewski directory. Can you move the files to /Users/stevegilewski, change --workDir, and remove the -v /gilewski:/gilewski mounting option from the config file? This code can run with either Docker or Singularity, no other options. Are you sure the Docker installation is ok? You should try running a simple hello-world script to check that it can run without sudo.
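For example, the standard Docker smoke test (if this only works with sudo, the installation should be fixed first):
docker run hello-world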
SM
I reinstalled Docker to be safe; that seems to be working fine. I have a bug report in with them regarding mounting the directories.
Yes, the working directory contains concatenated files for each pool (24 barcodes in each pool, seven pools total). I was trying to run analyses on one pool at a time. I can try moving the working directory--my impetus for keeping it on an external drive is that the files are quite large and my laptop doesn't have that much storage. I can try on our desktop Mac.
I'll see about running a hello-world this weekend. I did the Docker tutorials and it worked fine, but I'll try again with the reinstalled program.
Thanks again!
Progress! I was able to get to importFastq before it errored out. I have both directories on the external drive with Docker mounted correctly. I'm getting a duplicate file name error at the import stage, and I'm not exactly sure how to correct it. The file names were generated at sequencing; I thought they would be unique enough with the barcode number.
Any thoughts? Anne
Hi, it seems /Volumes/gilewski/ConcatFiles/Pool2 does not contain only one fastq.gz file for each sample. Please fix this, then remove the /Volumes/gilewski/MetONTIIME/manifest.txt and /Volumes/gilewski/MetONTIIME/sample-metadata.tsv files and restart the pipeline.
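For example, from a terminal (paths as above):
rm /Volumes/gilewski/MetONTIIME/manifest.txt /Volumes/gilewski/MetONTIIME/sample-metadata.tsv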
SM
Hi, that is really odd. I have 24 concatenated fastq.gz barcodes with no dupes in that pool. I'm removing the files and trying again.
Thank you!
The manifest.txt and sample-metadata.tsv files are not generated if they are found. So, my guess is that at some point you specified as workDir a directory containing multiple pools, with duplicated file names, and that's why you are getting the error. If this is the case, removing those two files should fix the issue.
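As a quick check (a sketch, assuming a tab-separated manifest with sample IDs in the first column; the header line is harmless here), duplicated IDs can be listed with:
cut -f1 /Volumes/gilewski/MetONTIIME/manifest.txt | sort | uniq -d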
SM
That didn't work; I'm still getting the same error. In the line below, if I don't have the metadata.tsv file created, should I have the path as below, or just where I want the file to go, i.e., stop the path at ...MetONTIIME/?
// Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv"
Next try: I did another run with just five barcodes from one pool with the above line as is. These barcodes passed the first few processes.
It errored out again at derepSeq; are there other paths I need to have in the .config code?
Hi, it seems there are issues with the metadata file.
sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv"
is the correct way of specifying it, but keep in mind that you should specify a different file for each pool; otherwise the wrong one is used and samples are not going to be found. The same goes for the manifest.txt file, though that is ok for this pool, as the pipeline was able to reach this point. This behaviour of the sample metadata file is intended: one may want to add some metadata to be used in subsequent analyses, while, if the file does not exist yet, it is created with minimal information.
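For example, a per-pool setup could look like this (file names are illustrative):
// when analysing Pool2
workDir="/Volumes/gilewski/ConcatFiles/Pool2"
sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata_Pool2.tsv"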
SM
I think I figured out at least one error with my test data: the metadata file isn't formatted properly. I had erroneously thought the process would create a file with headers etc. Is there a format I need to follow with regard to setting up the .tsv? I did look at Keemei, but it was unclear if there is a standard for this type of data.
There was an issue with loading the file /Volumes/gilewski/MetONTIIME/final_metadata.tsv as metadata:
Failed to locate header. The metadata file may be empty, or consists only of comments or empty rows.
There may be more errors present in the metadata file. To get a full report, sample/feature metadata files can be validated with Keemei: https://keemei.qiime2.org
Hi, if it doesn't exist, the one created automatically will be compliant with QIIME2 requirements. Otherwise, if you want to create it yourself, have a look at the QIIME2 documentation or validate it with Keemei.
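For reference, a minimal sketch of a QIIME2-compliant metadata file (tab-separated; the first column header must be a recognised ID header such as sample-id, and the second column here is purely illustrative):
sample-id	description
barcode01	pool2-sample1
barcode02	pool2-sample2
SM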
Okay. So if the file doesn't exist (i.e., I didn't make one), is this how the line should look?
// Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="/Volumes/gilewski/MetONTIIME/final_metadata.tsv"
Or do I stop the path at .../MetONTIIME?
Me again. Sorry for all the posts. I am erroring out at the derepSeq step. The files are created in the appropriate folder, but I don't know where the error is. I adjusted the CPU requirements, but perhaps that is the problem? Error code attached, with a screen cap of the files. I have a .tsv file created with sample-id and absolute-filepath headers to start.
Hi, sorry for the late reply, I have been (and still am) ill. I didn't see any specific error in the file you sent me; it just reports the exit code to be 1. From the web: "Exit Code 1 indicates that a container shut down, either because of an application failure or because the image pointed to an invalid file". I would just suggest restarting the pipeline, but I would first run the analysis on the toy dataset available in the repository. Best, SM
Very sorry to hear about your illness. Thank you for taking the time to post. Right now, I am working with my own set of three barcodes from one pool. I have completed the pipeline up to the diversity analysis:
All metadata filtered after dropping columns that contained non-categorical data.
Which, after research, leads me to believe my metadata file is not set up correctly. I would love for it to be created by the pipeline, but I am unclear how the param line should be scripted and if other parts of the script need to be adjusted downstream.
I'm trying this now:
// Path to working directory including fastq.gz files
workDir="/Volumes/gilewski/ConcatFiles/Test"
// Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
sampleMetadata="."
Hi, you should specify the full path to a file which, if it does not exist yet, will be created, e.g.:
sampleMetadata="/Volumes/gilewski/MetONTIIME/sample-metadata.tsv"
SM
Got it. I was able to run the pipeline last night to the end, with all visualizations and tables. The only change I made today was to the taxa level parameter; I used the same test data and didn't change anything else.
Docker is erroring out here: Command error: There was a problem importing /Volumes/gilewski/resultsDir/importFastq/manifest.txt:
/Volumes/gilewski/resultsDir/importFastq/manifest.txt is not a(n) SingleEndFastqManifestPhred33V2 file:
There was an issue with loading the metadata file:
Metadata IDs must be unique. The following IDs are duplicated: 'FAX22978_pass_barcode03_59c1dc36_e3ae5695'
The metadata.tsv is created, but won't transfer data from the manifest.txt. I am using exactly three concatenated files with no duplicates. I clear out the output files with each run, so I'm starting fresh. I freed up more room on my computer and upped the CPU usage via docker to 4 since I was running over 3 cores. I have restarted Docker and ensured all updates were done. My computer is also up to date.
I tried --resume to see if I can get the pipeline going again, but I still error here today. Any thoughts? Again, no changes to the code from last night's successful run.
The error is quite self-explanatory, and indeed it seems to me (opening the file from the phone) that there are duplicates in the manifest.txt file, aren't they?
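For reference, a valid SingleEndFastqManifestPhred33V2 manifest is a tab-separated file with one line per sample, along these lines (IDs and paths are illustrative):
sample-id	absolute-filepath
barcode01	/Volumes/gilewski/ConcatFiles/Test/barcode01.fastq.gz
barcode02	/Volumes/gilewski/ConcatFiles/Test/barcode02.fastq.gz
SM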
Yes, but if I have the process create the metadata.tsv and my test data has no duplicates, I'm unclear why this happened. Is it as simple as deleting from the manifest.txt and resuming the process? Anne
Yes, I have no idea why it contains duplicated lines, but you can manually remove them and start the pipeline from scratch, without the --resume parameter. If the manifest.txt file is found, it won't be edited by the pipeline.
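For example (a sketch that drops exact duplicate lines while keeping their order; if the same sample-id appears with two different paths, the wrong lines must be removed by hand instead):
awk '!seen[$0]++' /Volumes/gilewski/resultsDir/importFastq/manifest.txt > manifest.dedup.txt
mv manifest.dedup.txt /Volumes/gilewski/resultsDir/importFastq/manifest.txt
SM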
Great. I'll give that a go. The successful run worked beautifully and we were really excited about it ☺️ Happy Holidays! Anne
Happy holidays to you as well! I am going to close the issue; feel free to reopen it in case you have any further questions! Best wishes, SM
Good evening!
I was wondering if I could trouble you to look over my .nf and .config code. I'm getting an error right at the jump when executing the program, at the importdB process. My files are concatenated, so I had turned that process to 'false'.
I realize I'm still getting a Docker file link issue; I have mounted the directory in Docker, but am still having some issues. I had to save these as .odt, as .rtf isn't supported (I'm on a Mac):
MOTconfig.odt Error.odt MOTnf.odt
Thank you so much, Anne