Closed smb20200615 closed 1 year ago
Hi @smb20200615,
It really depends on what you want to do, but I see two steps where it seems reasonable to integrate MAGs from other pipelines:
At the bin_refinement step: here you would need 3 bin sets generated from a single assembly (i.e. same contigs/headers but binned differently). Maybe you have generated MAGs with a new binner or alternative MAG pipeline, so you could refine + reassemble them to get a final consensus set.
At the metabolic model reconstruction step: here you would use ORF-annotated protein bins to build metabolic models with CarveMe. Then you can simulate them within communities using SMETANA to predict metabolic interactions.
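For the second entry point, a minimal sketch of what the manual route could look like is given below. It assumes that CarveMe (`carve`) and SMETANA (`smetana`) are installed and that your protein bins are `.faa` files in one folder. The function names, folder layout, and the `DRY_RUN` switch are hypothetical and not part of metaGEM; by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of metaGEM.
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (requires carve and smetana on PATH).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_models <protein_bins_dir> <models_dir>
# Reconstructs one model per ORF-annotated protein bin (*.faa) with CarveMe.
build_models() {
    local bins_dir="$1" models_dir="$2" faa
    mkdir -p "$models_dir"
    for faa in "$bins_dir"/*.faa; do
        [ -e "$faa" ] || continue   # skip if the folder has no .faa files
        run carve "$faa" -o "$models_dir/$(basename "$faa" .faa).xml"
    done
}

# simulate_community <models_dir> <output_prefix>
# Runs SMETANA in detailed mode on all models of one community.
simulate_community() {
    run smetana --detailed -o "$2" "$1"/*.xml
}

# Example (paths are placeholders):
# build_models protein_bins models
# simulate_community models my_community
```

Options such as the gap-filling medium or a custom media database are left out on purpose, since they depend on your samples.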
Please let me know if you had any other ideas in mind, or if you have further questions.
Best wishes, Francisco
Hi Francisco,
Thank you for your thorough comment. I am struggling to make MAGs with your pipeline because I don't have the account name for my cluster account. When I delete the account line, I get errors during job submission. Perhaps the better question is how to run without the account name line:
"__default__" : {
    "account" : "your-account-name",
    "time" : "0-06:00:00",
    "n" : 48,
    "tasks" : 1,
    "mem" : "180G",
    "name" : "DL.{rule}",
    "output" : "logs/{wildcards}.%N.{rule}.out.log"
},
Hi @smb20200615,
Indeed, that is quite strange. Have you previously submitted any jobs successfully on your cluster without an account name? I do not think that this should be possible, unless it is your own cluster/workstation. On my institution's SLURM-based cluster one can use the mybalance command to view one's accounts. If this does not work, then I would suggest contacting your cluster support team or having a look at any documentation provided for your institution's cluster.
Best wishes, Francisco
Also, sometimes these commands fail (it seems there were no bins produced):
bash metaGEM.sh -t metabat -j 2 -c 24 -m 80 -h 10
bash metaGEM.sh -t maxbin -j 2 -c 24 -m 80 -h 10
Should we just proceed with downstream steps in that case?
Indeed, you can simply proceed to bin refinement and reassembly. Let me know how it goes!
Sorry for the follow-up question. When I run refinement, it still tries to rerun the samples that failed. I have been following the steps of the tutorial.
No problem, happy to help with your questions.
OK, that makes sense, since the binRefine rule takes in 3 inputs and you are missing one of them.
https://github.com/franciscozorrilla/metaGEM/blob/eb0860945fd0d8efa8495aeb441b55969c4e97b1/Snakefile#L879-L883
I would recommend trying to "trick" snakemake into thinking it already created the files by creating a dummy folder for those samples that failed to generate any bins with maxbin, e.g. mkdir maxbin/sample/sample.maxbin-bins. I believe that this is what I have done in the past, since metaWRAP will accept an empty folder and just proceed with the remaining draft bin sets.
Please let me know if this works for you. If so, then I will try to modify the maxbin rule so that it creates this dummy directory if the binning fails to generate any MAGs.
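If several samples are affected, the workaround above can be scripted. The following is only a sketch: the function name `make_dummy_bins` and the sample IDs are hypothetical, and it assumes the folder layout from the example path above (`maxbin/<sample>/<sample>.maxbin-bins`). Existing bin folders are left untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of metaGEM.
# Creates an empty maxbin bin folder for every listed sample that has none,
# so that snakemake finds the expected binRefine input.
# Usage: make_dummy_bins <maxbin_dir> <sample_id>...
make_dummy_bins() {
    local root="$1" sample bins
    shift
    for sample in "$@"; do
        bins="$root/$sample/$sample.maxbin-bins"
        if [ ! -d "$bins" ]; then
            mkdir -p "$bins"
            echo "created empty bin folder: $bins"
        fi
    done
}

# Example (sample IDs are placeholders):
# make_dummy_bins maxbin sampleA sampleB
```

Running it from the metaGEM working directory with the IDs of the failed samples should be enough; samples that already have bins produce no output.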
Thank you so much! That fixed it. I was wondering if the metabolic models were done just for the prokaryotic MAGs. Also, do you have guidance on how to adapt the media config parameter for our biome of interest?
Glad to hear it worked!
Indeed, CarveMe only reconstructs models for prokaryotic MAGs. Regarding media, this is something that you would have to adapt/design based on the literature and your domain knowledge, using metabolite IDs from the BiGG database. What biome are your samples from?
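To illustrate what a custom medium could look like on disk, here is a small sketch that writes entries to a tab-separated media file. It assumes the four-column layout (medium, description, compound, name) used by CarveMe's media database; the helper name `write_medium`, the file name, the medium ID, and the compound list are placeholders and do not describe any real biome.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of metaGEM or CarveMe.
# Appends one medium to a tab-separated media file, assuming the columns
# medium / description / compound / name.
# Usage: write_medium <tsv_file> <medium_id> <description> <bigg_id=name>...
write_medium() {
    local file="$1" id="$2" desc="$3" pair
    shift 3
    # Write the header only if the file is missing or empty.
    [ -s "$file" ] || printf 'medium\tdescription\tcompound\tname\n' > "$file"
    for pair in "$@"; do
        # Text before the first "=" is the BiGG metabolite ID, the rest is its name.
        printf '%s\t%s\t%s\t%s\n' "$id" "$desc" "${pair%%=*}" "${pair#*=}" >> "$file"
    done
}

# Example (placeholder medium, compounds chosen only for illustration):
# write_medium my_media.tsv MYMED "Placeholder medium" \
#     "glc__D=D-Glucose" "nh4=Ammonium" "pi=Phosphate"
```

The compound IDs are given without a compartment suffix, and the actual list would need to come from the literature on your biome.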
Is there a way to input MAGs that we have generated using other pipelines into the tool? If so, at what point?