wwood / singlem

Novelty-inclusive microbial community profiling of shotgun metagenomes
http://wwood.github.io/singlem/
GNU General Public License v3.0

Targeting functional genes using SingleM #62

Open daisyzhangsysu opened 3 years ago

daisyzhangsysu commented 3 years ago

Hi Ben, I am wondering whether I can use SingleM to build OTU tables from functional genes, for example by targeting dsrA with a dsrA package from GraftM.

wwood commented 3 years ago

Hi,

I have not tried this, but I do not see why not. You would just need to choose an appropriate starting position as well as making the package - I'd do this using metagenomes that contain a diversity of dsr genes. HTH, ben

daisyzhangsysu commented 3 years ago

Thanks so much! You are right, I am working with metagenomes. But I don't know how to make a SingleM package. Can you provide an introduction to doing this?

wwood commented 3 years ago

Once you have a graftm package it is straightforward from a software perspective: you can just run singlem create, which takes a graftm package and a position in the HMM at which to start the window (i.e. the starting position I mentioned above).

daisyzhangsysu commented 3 years ago

Thanks! I tried singlem create and found that two arguments are required (--hmm_position and --window_size). I set both values arbitrarily, and the SingleM-compatible package creation finished. Would these arbitrarily chosen values affect the final result?

wwood commented 3 years ago

Hi,

Yes those matter, and quite a lot. For the window size I'd just use the default (it isn't required, there's a mistake in the help documentation I'll fix).

--hmm-position is the harder one. Translated reads are aligned to the HMM, and this argument specifies the HMM position at which the window starts. It is different for each HMM. In general you want the position that has the most conservation amongst all dsr genes.

To find that position, you can run graftm graft on some metagenomes that contain a diversity of dsr genes, then run singlem seqs to find the best hmm_position, and finally run singlem create.

One thing to note: in the released version of singlem (v0.13.2) there's a bug in singlem seqs, so you'll want to work off the dev branch. So:

git clone https://github.com/wwood/singlem.git
cd singlem
git checkout -b dev origin/dev
conda activate your-singlem-environment
bin/singlem -h
graftm graft ...
singlem seqs ....
singlem create ...

It's a little convoluted as it isn't done too often, I'm afraid.
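To make the idea of "the position that has the most conservation" concrete, here is a minimal Python sketch. It is not SingleM's implementation of singlem seqs. It assumes you already have protein sequences aligned so that each column corresponds to one HMM position, scores each column by the fraction of sequences sharing the most common residue, and picks the window start with the highest mean score. The function names, the toy alignment, and the 0-based indexing are all assumptions made for illustration; check the singlem documentation for how positions are actually numbered.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Per-column fraction of sequences that share the most common residue.

    Gap characters ('-' and '.') never count as the shared residue, so
    gappy columns score lower.
    """
    n_seqs = len(aligned_seqs)
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs if seq[i] not in "-.")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n_seqs)
    return scores


def best_window_start(aligned_seqs, window_size):
    """Return (start, mean_score) of the most conserved window (0-based start)."""
    scores = column_conservation(aligned_seqs)
    if window_size > len(scores):
        raise ValueError("window_size is longer than the alignment")

    best_start, best_mean = 0, -1.0
    for start in range(len(scores) - window_size + 1):
        mean = sum(scores[start:start + window_size]) / window_size
        if mean > best_mean:  # ties keep the earliest window
            best_start, best_mean = start, mean
    return best_start, best_mean


# Toy alignment, invented for illustration only (not real dsr sequences).
toy_alignment = ["MKAACDEF", "MRAACDEY", "-KAACDEW", "MHAACDQF"]
print(best_window_start(toy_alignment, 4))  # (2, 1.0)
```

In practice the alignment would come from reads recovered from metagenomes with a diversity of dsr genes, as described above, so that the chosen window is well covered and conserved across the whole gene family rather than in a few reference sequences.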

houjialin commented 1 year ago

Hi Ben,

I tried the --singlem-packages option of singlem pipe to classify a metagenome. Whether I used the custom database I created or the default ribosomal SingleM package, every run failed with the following error. Can you give me some information about this problem?

my command with the default package

singlem pipe --forward ../SRR7224128_1.fastq --reverse ../SRR7224128_2.fastq --singlem-packages ~/Software/singlem-main/data/dbs/S3.1.0.metapackage_20221209.smpkg.zb/payload_directory/S3.12.ribosomal_L1.spkg/ --otu-table otu_table.tsv --threads 50

error information

02/21/2023 03:07:06 PM INFO: SingleM v1.0.0beta5
02/21/2023 03:07:06 PM INFO: Loaded 1 SingleM packages
02/21/2023 03:07:06 PM INFO: Using as input 1 different pairs of sequence files e.g. ../SRR7224128_1.fastq & ../SRR7224128_2.fastq
02/21/2023 03:07:06 PM INFO: Filtering sequence files through DIAMOND blastx
02/21/2023 03:10:50 PM INFO: Finished DIAMOND prefilter phase
02/21/2023 03:10:50 PM INFO: Assigning sequences to SingleM packages with DIAMOND ..
02/21/2023 03:10:54 PM INFO: Running taxonomic assignment ..
02/21/2023 03:10:54 PM INFO: Assigning taxonomy by singlem query ..
Traceback (most recent call last):
  File "/home/houjialin/Software/singlem-main/bin/singlem", line 589, in <module>
    singlem.pipe.SearchPipe().run(
  File "/home/houjialin/Software/singlem-main/bin/../singlem/pipe.py", line 63, in run
    otu_table_object = self.run_to_otu_table(**kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/houjialin/Software/singlem-main/bin/../singlem/pipe.py", line 421, in run_to_otu_table
    otu_table_object = self.assign_taxonomy_and_process(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/houjialin/Software/singlem-main/bin/../singlem/pipe.py", line 456, in assign_taxonomy_and_process
    assignment_result = self._assign_taxonomy(
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/houjialin/Software/singlem-main/bin/../singlem/pipe.py", line 1173, in _assign_taxonomy
    query_based_assignment_result = PipeTaxonomyAssignerByQuery().assign_taxonomy(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/houjialin/Software/singlem-main/bin/../singlem/pipe_taxonomy_assigner_by_query.py", line 63, in assign_taxonomy
    sdb = SequenceDatabase.acquire(assignment_singlem_db)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/houjialin/Software/singlem-main/bin/../singlem/sequence_database.py", line 216, in acquire
    contents_path = os.path.join(
                    ^^^^^^^^^^^^^
  File "<frozen posixpath>", line 76, in join
TypeError: expected str, bytes or os.PathLike object, not NoneType

wwood commented 1 year ago

Hi,

Thanks for the report.

This is a bug, at least in that the error doesn't say what the problem is. The issue is that taxonomy assignment by smafa naive needs a singlem db in the metapackage, and there isn't one when only a single package is given. Does --assignment-method diamond work?

-------------- Ben Woodcroft Group leader, Centre for Microbiome Research, QUT
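For readers puzzled by the traceback above, the following Python sketch shows how passing None to os.path.join produces exactly that TypeError, and how an explicit check could turn it into a more informative message. This is an illustration only, not SingleM's code: the function names and the "contents.json" filename are hypothetical stand-ins.

```python
import os


def acquire_unchecked(db_path):
    # Same failing pattern as in the traceback: os.path.join is handed None.
    # "contents.json" is a hypothetical filename used only for this sketch.
    return os.path.join(db_path, "contents.json")


def acquire_checked(db_path):
    # A hypothetical guard that explains the actual problem to the user.
    if db_path is None:
        raise ValueError(
            "no singlem db is available for query-based taxonomy assignment; "
            "provide a metapackage containing one or choose a different "
            "assignment method"
        )
    return os.path.join(db_path, "contents.json")


try:
    acquire_unchecked(None)
except TypeError as err:
    print("unchecked:", err)

try:
    acquire_checked(None)
except ValueError as err:
    print("checked:", err)
```

The first call fails with the same opaque "expected str, bytes or os.PathLike object, not NoneType" message seen in the report, while the second states what is missing.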

