Inmembrane locally installed TMHMM, SignalP, LipoP #4

Closed Dimpledavray287 closed 4 years ago

Dimpledavray287 commented 4 years ago

Hi Boscoh, I am interested in identifying surface-exposed proteins in Lactobacillus using the inmembrane tool. I have approximately 7000 sequences, so I need to run TMHMM, SignalP and LipoP locally, and I have installed all three. I don't know where to change the paths in the inmembrane files so that inmembrane uses the local TMHMM, SignalP and LipoP installs instead of the web services.

Thanks, Dimple

pansapiens commented 4 years ago

If you run inmembrane_scan once (with no extra args) it will generate a default config file, inmembrane.config, in the current working directory. You can edit this to change the paths to the various tools (e.g. the signalp4_bin, lipop1_bin and tmhmm_bin variables).

It works best if you put the executables for each tool on your PATH environment variable and just use eg 'signalp4_bin': 'signalp', in the inmembrane.config file.
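
Before editing the config, it can help to confirm that each executable really is visible on PATH. The snippet below is a minimal sketch of such a check, not part of inmembrane: `find_missing_tools` is a hypothetical helper, the tool names in the list are assumptions that should be changed to match your installs, and it assumes a Python 3 interpreter for `shutil.which` (inmembrane itself runs under Python 2.7 here).

```python
import shutil


def find_missing_tools(names):
    """Return the names that cannot be resolved to an executable on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed executable names; adjust them to whatever your installs provide.
tools = ["signalp", "LipoP", "tmhmm", "hmmsearch"]
missing = find_missing_tools(tools)
print("missing from PATH:", missing or "none")
```

Any name reported as missing either is not installed, has a different executable name, or sits in a directory that is not on PATH.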

Dimpledavray287 commented 4 years ago

Thanks @pansapiens for your prompt reply.

I had put the executables for each tool on the PATH environment variable. When I used it with a small number of sequences (below 500) it worked fine (Output1 below). When I increased the number of sequences above 1000, it threw an error (Output2 below).

Commands used to move or copy the executable files onto the PATH:

A) SignalP: I transferred all the SignalP 4.1 executables that were inside 'bin'. Command: sudo mv nnhowplayer.Linux_x86_64 /usr/local/bin/nnhowplayer.Linux_x86_64 (likewise I moved all the other nnhowplayer files)

B) tmhmm: the bin/decodeanhmm binary executables. Command: sudo mv decodeanhmm.Linux_x86_64 /usr/local/bin/decodeanhmm.Linux_x86_64 (likewise I moved all the other decodeanhmm files)

C) LipoP: similarly, I moved all the files from LipoP to /usr/local/bin

Output1 : -

 Number of proteins in each class:
# CYTOPLASM(non-PSE)    379
# MEMBRANE(non-PSE) 87
# PSE(total)        32
#   PSE-Cellwall    7
#   PSE-Lipoprotein 8
#   PSE-Membrane    17
# SECRETED          13
# Output written to /home/dimple/Phd/inmembrane-0.95.0/Trylatest5.csv
# This run used SignalP 4.1, LipoP 1.0 (web interface), HMMER 3.0, TMHMM 2.0.
# References have been written to /home/dimple/Phd/inmembrane-0.95.0/Trylatest5/citations.txt 
# - please cite as appropriate.

Output2 : -

dimple@dimple-VirtualBox[inmembrane-0.95.0] inmembrane_scan Trylatest6.txt  
/home/dimple/.local/lib/python2.7/site-packages/ UserWarning: You are using a very old release of Beautiful Soup, last updated in 2011. If you installed the 'beautifulsoup' package through pip, you should know the 'beautifulsoup' package name is about to be reclaimed by a more recent version of Beautiful Soup which is incompatible with this version.

This will happen at some point after January 1, 2021.

If you just started this project, this is easy to fix. Install the 'beautifulsoup4' package instead of 'beautifulsoup' and start using Beautiful Soup 4.

If this is an existing project that depends on Beautiful Soup 3, the project maintainer (potentially you) needs to start the process of migrating to Beautiful Soup 4. This should be a relatively easy part of the Python 3 migration.

# inmembrane 0.95.0 (
# Loading existing inmembrane.config
# SignalP(scrape_web), input.fasta > signalp_scrape_web.out
Traceback (most recent call last):
  File "/usr/local/bin/inmembrane_scan", line 87, in <module>
  File "/usr/local/lib/python2.7/dist-packages/inmembrane/", line 139, in process
    plugin.annotate(params, proteins)
  File "/usr/local/lib/python2.7/dist-packages/inmembrane/plugins/", line 111, in annotate
    pollingurl = soup.findAll('a')[0]['href']
IndexError: list index out of range

pansapiens commented 4 years ago

From the error, it looks as if it's using the SignalP web service rather than the locally installed version. You should check that 'signalp4_bin': 'signalp' is set in your inmembrane.config and that there is no signalp_scrape_web in the config.

If you post your inmembrane.config here it might help diagnose.
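
As a quick way to spot web-service settings, the sketch below scans the text of a config file for uncommented entries whose value ends in `_scrape_web`. It is an illustration only: `find_web_settings` is a hypothetical helper, it treats the config purely as text (it assumes one `'key': 'value',` entry per line with `#` comments, as in the default file), and the sample text is made up for the example.

```python
def find_web_settings(config_text):
    """Return (key, value) pairs from uncommented lines whose value ends in '_scrape_web'."""
    hits = []
    for line in config_text.splitlines():
        code = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in code:
            continue
        key, value = code.split(":", 1)
        key = key.strip().strip("'\"")
        value = value.strip().rstrip(",").strip().strip("'\"")
        if value.endswith("_scrape_web"):
            hits.append((key, value))
    return hits


# Made-up sample in the style of inmembrane.config.
sample = """
#   'signalp4_bin': 'signalp',
  'signalp4_bin': 'signalp_scrape_web',
  'tmhmm_bin': 'tmhmm',
"""
print(find_web_settings(sample))
```

To check a real file, read it with `open("inmembrane.config").read()` and pass the text to the function; an empty list means no active setting points at a web scraper.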

Dimpledavray287 commented 4 years ago

You may be right. I have posted my inmembrane.config file below.

  'fasta': '',
  'csv': '',
  'out_dir': '',
  'protocol': 'gram_pos', # 'gram_neg'

#### Signal peptide and transmembrane helix prediction
#   'signalp4_bin': 'signalp',
  'signalp4_bin': 'signalp_scrape_web',
#   'lipop1_bin': 'LipoP',
  'lipop1_bin': 'lipop_scrape_web',
#   'tmhmm_bin': 'tmhmm',
  'tmhmm_bin': 'tmhmm_scrape_web',
   'memsat3_bin': 'runmemsat',
  'helix_programs': ['tmhmm'],
# 'helix_programs': ['tmhmm', 'memsat3'],
  'terminal_exposed_loop_min': 50, # unused in gram_neg protocol
  'internal_exposed_loop_min': 100, # try 30 for gram_neg

#### Sequence similarity and motif prediction
  'hmmsearch3_bin': 'hmmsearch',
  'hmm_evalue_max': 0.1,
  'hmm_score_min': 10,

#### Outer membrane beta-barrel predictors
  'barrel_programs': ['tmbetadisc-rbf'],
# 'barrel_programs': ['bomp', 'tmbetadisc-rbf'],
  'bomp_clearly_cutoff': 3, # if >= than this, always classify as an OM(barrel)
  'bomp_maybe_cutoff': 1, # must also have a signal peptide to be OM(barrel)
  'tmbetadisc_rbf_method': 'aadp', # aa, dp, aadp or pssm

Dimpledavray287 commented 4 years ago

I am not sure, but I see two possibilities:

1) We should change the inmembrane_scan file, because it loads every file present in the plugins directory, and that directory has both and

import inmembrane
# from inmembrane import helpers
from inmembrane.helpers import *
# will load all plugins in the plugins/ directory
from inmembrane.plugins import *
import unittest 

2) The tool executables were not properly added to the PATH environment variable.

dimple@dimple-VirtualBox[inmembrane] echo $PATH                      

dimple@dimple-VirtualBox[inmembrane] cd /usr/local/bin               

pansapiens commented 4 years ago

It looks like you do need to change your inmembrane.config file - remove any of the lines with _scrape_web.

The section with the signalp/tmhmm/lipop settings should look like:

  'signalp4_bin': 'signalp',
  'lipop1_bin': 'LipoP',
  'tmhmm_bin': 'tmhmm-2.0c',

Note the executable name for tmhmm, based on what you have in /usr/local/bin, is tmhmm-2.0c. Putting the full path (/usr/local/bin/tmhmm-2.0c) in the config file should work too, if I remember correctly.

The binaries for the tools are in your PATH, so that part looks fine. Also ensure you've followed the SignalP etc install instructions - they also require copying the content of lib to /usr/local/lib (see, or the README in the SignalP tarball - seems the 4.1 readme link is dead now).

To be sure, you can try running each tool on its own with a small FASTA file to make sure they are installed correctly (the inmembrane_scan -t -n command tests that all the locally installed tools run as expected).

No need to modify inmembrane_scan - all the plugins are loaded at startup, but only the ones set via inmembrane.config are actually used during the protocol.
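
For the small FASTA test file mentioned above, one option is to cut the first few records out of the full input. The following is a rough sketch under that assumption: `first_records` is a hypothetical helper, and the three short sequences are invented purely to show the behaviour.

```python
def first_records(fasta_text, n):
    """Return the first n FASTA records of fasta_text as a single string."""
    out, count = [], 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            count += 1
            if count > n:
                break
        if count:  # ignore anything before the first header line
            out.append(line)
    return "\n".join(out) + "\n" if out else ""


# Invented records, for illustration only.
example = ">seq1\nMKKLLA\n>seq2\nMAVTTS\n>seq3\nMTQRLL\n"
print(first_records(example, 2))
```

Writing the returned text to a file gives a small input that each tool, or inmembrane_scan itself, can be run against before committing to the full set of sequences.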

Dimpledavray287 commented 4 years ago

Thank you very much @pansapiens. It worked.

Number of proteins in each class:
# CYTOPLASM(non-PSE)    5040
# MEMBRANE(non-PSE) 1187
# PSE(total)        513
#   PSE-Cellwall    108
#   PSE-Membrane    405
# SECRETED          264
# Output written to /home/dimple/Phd/inmembrane-0.95.0/Hypothetical.csv
# This run used SignalP 4.0, LipoP 1.0, HMMER 3.0, TMHMM 2.0.
# References have been written to /home/dimple/Phd/inmembrane-0.95.0/Hypothetical/citations.txt 
# - please cite as appropriate.

pansapiens commented 4 years ago

Great !