Closed ohickl closed 1 year ago
Thank you for using MetaDecoder. Which CAMI dataset causes this error? Please provide me the URL of the assembly and send the coverage file to congcong_liu@icloud.com, thank you very much.
In addition, these tips may help you. E.g. metadecoder cluster -f ASSEMBLY -c COVERAGE -s SEED -o OUTPUT The COVERAGE file is created using metadecoder coverage. The inputs are a series of sam files, please make sure that all sam files have the same @SQ lines that corresponds to the ASSEMBLY file.
Thanks for taking a look at this! I sent you the coverage file for a sample as well as a link to the assembly.
I run MetaDecoder on this dataset without any error.
STEP 1. java -jar camiClient.jar -d https://openstack.cebitec.uni-bielefeld.de:8080/swift/v1/CAMI_Urogenital_tract . -p short_read/2017.12.04_18.56.22_sample_2/contigs/anonymous_gsa.fasta.gz mv anonymous_gsa.fasta.gz METADECODER.ASSEMBLY
STEP 2. metadecoder seed -f METADECODER.ASSEMBLY -o METADECODER.SEED md5sum METADECODER.SEED (dbec6c77e574d9e885e49c1fbb11b485)
STEP 3. metadecoder cluster -f METADECODER.ASSEMBLY -s METADECODER.SEED -c METADECODER.COVERAGE -o METADECODER md5sum METADECODER.COVERAGE (4a092a997d8c5d17bd6828fd58e06302)
Got 45 clusters.
Python 3.8.10 and MetaDecoder 1.0.12. Step 3 was tested on both Linux 3.10.0-862.el7.x86_64 and MacOS 12.4.
I guess I know why.
2022-07-08 14:05:56 -> Running the DPGMM algorithm to obtain clusters. 2022-07-08 14:05:56 -> Done.
This means that you cached the dpgmm results, which are prefixed with the assembly file name. Usually the assembly file name is unique, so different dpgmm files are generated.
If you successively set different assembly files with the same name, it will cause the latter one to use the former's dpgmm cache.
Therefore, you may need to remove the dpgmm cache before running the metadecoder cluster as follows: rm *.dpgmm metadecoder cluster .......
I hope this will solve your problem.
Ah, makes sense. I was calling multiple jobs for different samples with identical assembly names from the same location. Changing the working dir to that of the respective output dir fixed it. Thanks! Might be useful to have a tmp_dir/work_dir flag for MetaDecoder, to allow the user to set it and avoid potential confusion?
Hi, I am trying to run MetaDecoder on some of the CAMI 2 toy samples, but for most of them get an error during the clustering step:
I am unsure what to make of this, as it runs fine for a few of samples. Let me know, if you require the COVERAGE or SEED file, or any others.
Best
Oskar