phac-nml / mob-suite

MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies
Apache License 2.0

How to specify -p and -t arguments in mob cluster without any information #120

Closed Abelcanc3rhack3r closed 1 year ago

Abelcanc3rhack3r commented 2 years ago

Hi, I managed to install MOB-suite via pip.

Now I have another problem.

We have one FASTA file containing 50 random plasmid sequences. Nothing else is known about the plasmids.

We want to cluster the plasmid sequences. We are using the mob_cluster command with the FASTA file as the infile, but what do we pass for -p and -t? It won't run without these two arguments.

Thanks

kbessonov1984 commented 2 years ago

Hi,

The mob_cluster tool is used to build a plasmid database from scratch or to update an existing one. There are some instructions on how to build and update a database in this comment: https://github.com/phac-nml/mob-suite/issues/20#issuecomment-476278194.

You can run mob_typer and see to which primary and secondary cluster(s) your 50 plasmid sequences will be assigned.

Please provide more information on what you are trying to accomplish with this set of plasmids and what results you are getting from mob_typer.

Abelcanc3rhack3r commented 2 years ago

We are just trying to cluster the plasmids with each other, not necessarily with other known plasmids in a database.

OK, I will run mob_typer and share the results with you when we get them.

luciagrami commented 1 year ago

Hello,

I have the same question. I ran mob_typer on the new set of plasmids, but I'm not sure how to create the taxonomy file. I also couldn't find a sample file, so I'm not sure what the taxonomy file should look like.

Thanks!

jrober84 commented 1 year ago

I will update the documentation for MOB-cluster, as there are currently few details available in the repo. In the meantime, the taxonomy file is just a two-column tab-delimited file with the header columns id and organism, where id is the sequence id of each plasmid you wish to add to your database and organism is the complete name that you want to associate with that sequence. The organism name must be present in the NCBI taxonomy for it to work correctly.
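
A minimal sketch of how such a taxonomy file could be generated, based only on the description above (two tab-separated columns with the header id and organism). The helper names `fasta_ids` and `write_taxonomy`, the plasmid ids, and the organism assignments are all hypothetical placeholders. It also assumes the sequence id is the first whitespace-delimited token of each FASTA header, which may not match how MOB-suite parses ids.

```python
import csv
import io


def fasta_ids(handle):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                yield header.split()[0]


def write_taxonomy(ids, organism_by_id, out_handle):
    """Write a two-column, tab-delimited table with the header id, organism."""
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "organism"])
    for seq_id in ids:
        # A KeyError here means a plasmid has no organism assigned yet.
        writer.writerow([seq_id, organism_by_id[seq_id]])


# Placeholder data: two made-up plasmid records and made-up host assignments.
fasta_text = ">plasmid_1 some description\nACGT\n>plasmid_2\nGGCC\n"
organisms = {
    "plasmid_1": "Escherichia coli",
    "plasmid_2": "Klebsiella pneumoniae",
}

out = io.StringIO()
write_taxonomy(fasta_ids(io.StringIO(fasta_text)), organisms, out)
taxonomy_text = out.getvalue()
print(taxonomy_text, end="")
```

For a real run, the in-memory strings would be replaced with open file handles for the plasmid FASTA and the output file, and each organism name would need to be checked against the NCBI taxonomy by hand.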

luciagrami commented 1 year ago

Thanks!