CostaLab / reg-gen

Regulatory Genomics Toolbox: Python library and set of tools for the integrative analysis of high throughput regulatory genomics data.
https://reg-gen.readthedocs.io/

warnings/errors in rgt-THOR #271

Open sunta3iouxos opened 4 months ago

Hi all, as with #270 I created a mamba/conda environment as follows:

$ mamba create -n rgts -c bioconda rgt
$ mamba activate rgts

The Python version is 3.7.12:

$ python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0] on linux

I am running rgt-THOR with the following command:

$ rgt-THOR KOUT_Vs_WTUT.config --merge --output-dir THOR/ --report --deadzones assemblies/mm39/annotation/blacklist.bed --rmdup --pvalue 0.01 --foldchange 4 --save-input --name THOR_KOUT_Vs_WTUT

At the Train HMM step I get the following warnings/errors:

**Use global TMM approach
Compute HMM's training set
Train HMM**
mambaforge/envs/rgts/lib/python3.7/site-packages/rgt/THOR/neg_bin_rep_hmm.py:295: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  self.alpha = np.matrix([[self.get_alpha(m) for m in np.asarray(self.mu[i])[0]] for i in range(self.n_features)])
mambaforge/envs/rgts/lib/python3.7/site-packages/rgt/THOR/neg_bin_rep_hmm.py:159: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  self.neg_distr = np.matrix([raw1, raw2]) #matrix of all Neg. Bin. Distributions, columns=HMM's state (3), row=#samples (2)
Compute HMM's posterior probabilities and Viterbi path to call differential peaks
- taking into account chr1
mambaforge/envs/rgts/lib/python3.7/site-packages/rgt/THOR/MultiCoverageSet.py:164: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  overall_coverage_strand = [[np.matrix(tmp2[0][0]), np.matrix(tmp2[0][1])],
mambaforge/envs/rgts/lib/python3.7/site-packages/rgt/THOR/MultiCoverageSet.py:165: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  [np.matrix(tmp2[1][0]), np.matrix(tmp2[0][1])]]
mambaforge/envs/rgts/lib/python3.7/site-packages/rgt/THOR/MultiCoverageSet.py:167: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  overall_coverage = [np.matrix(tmp[0]), np.matrix(tmp[1])]
**Compute HMM's posterior probabilities and Viterbi path to call differential peaks**
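
For context on the messages above: a PendingDeprecationWarning is NumPy's notice that `np.matrix` is discouraged, and by itself it does not indicate a failed computation. The sketch below is not taken from the rgt code base; the helper names `count_matrix_warnings` and `as_ndarray` are made up for illustration. It shows how such warnings can be filtered with Python's standard `warnings` module, and what a plain `ndarray` replacement for an `np.matrix(...)` call might look like.

```python
import warnings

import numpy as np


def count_matrix_warnings(suppress):
    """Build an np.matrix and return how many PendingDeprecationWarnings were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        if suppress:
            # Hypothetical user-side workaround: drop only this warning category
            warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
        np.matrix([[1.0, 2.0], [3.0, 4.0]])  # same constructor that appears in the log
    return sum(issubclass(w.category, PendingDeprecationWarning) for w in caught)


def as_ndarray(rows):
    """Plain ndarray stand-in for np.matrix(rows), kept at least 2-D like a matrix."""
    return np.atleast_2d(np.asarray(rows))


if __name__ == "__main__":
    print("warnings without filter:", count_matrix_warnings(suppress=False))
    print("warnings with filter:   ", count_matrix_warnings(suppress=True))
    print(as_ndarray([1.0, 2.0]).shape)
```

Setting `PYTHONWARNINGS=ignore::PendingDeprecationWarning` in the shell before calling rgt-THOR may have a similar effect, assuming the tool does not install its own warning filters at startup. Note that `*` means matrix multiplication for `np.matrix` but element-wise multiplication for `ndarray`, so a real migration inside rgt would need more care than this sketch suggests.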

The following output seems to be related to the bigWig conversion, as in #270:

**wigToBigWig v 4** - Convert ascii format wig file (in fixedStep, variableStep
or bedGraph format) to binary big wig format.
usage:
   wigToBigWig in.wig chrom.sizes out.bw
Where in.wig is in one of the ascii wiggle formats, but not including track lines
and chrom.sizes is two column: <chromosome name> <size in bases>
and out.bw is the output indexed big wig file.
Use the script: fetchChromSizes to obtain the actual chrom.sizes information
from UCSC, please do not make up a chrom sizes from your own information.
options:
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024
   -clip - If set just issue warning messages rather than dying if wig
                  file contains items off end of chromosome.
   -unc - If set, do not use compression.
   -fixedSummaries - If set, use a predefined sequence of summary levels.
   -keepAllChromosomes - If set, store all chromosomes in b-tree.
(the same wigToBigWig usage message is printed seven more times)

And this goes on for all chromosomes checked.
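
A tool that prints its usage text instead of running has usually been called with arguments it could not accept, for example a missing or empty input path. How rgt-THOR invokes wigToBigWig is not shown in this issue, so the following is only a pre-flight sketch under that assumption. The function names and the example file names are hypothetical; the two-column chrom.sizes layout is the one described in the usage message above.

```python
import shutil
from pathlib import Path


def check_chrom_sizes(path):
    """Return a list of problems found in a two-column chrom.sizes file."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: file not found"]
    lines = [ln for ln in p.read_text().splitlines() if ln.strip()]
    if not lines:
        return [f"{path}: file is empty"]
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Expected layout: <chromosome name> <size in bases>
        if len(fields) != 2 or not fields[1].isdigit():
            problems.append(f"{path}:{number}: unexpected line {line!r}")
    return problems


def preflight(wig_path, chrom_sizes_path):
    """Collect the reasons a wigToBigWig call would be likely to fail."""
    problems = []
    if shutil.which("wigToBigWig") is None:
        problems.append("wigToBigWig not found on PATH")
    if not Path(wig_path).is_file():
        problems.append(f"{wig_path}: input file not found")
    problems.extend(check_chrom_sizes(chrom_sizes_path))
    return problems


if __name__ == "__main__":
    # Hypothetical file names; replace with the files from your own run.
    for problem in preflight("THOR/example.wig", "mm39.chrom.sizes"):
        print(problem)
```

An empty list from `preflight` means only that these basic checks passed; it does not confirm that the conversion itself would succeed.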