RiversDong / GageTracker

tool for dating gene age by micro- and macro-synteny with high speed and accuracy

Windowmasker #1

Closed DiegoSafian closed 1 month ago

DiegoSafian commented 2 months ago

Hi, I have had problems installing Windowmasker; essentially, I do not know how to do it. I thought I could skip it, but apparently that is not possible. Please see the attached age_dating.txt.

Any suggestions?

Kind regards, Diego

RiversDong commented 2 months ago

Thank you for your feedback.

Could you please let us know where you downloaded Windowmasker? It’s a subprogram of BLAST+. I just tested Windowmasker from ncbi-blast-2.2.30+, and this step ran successfully.

If you installed Windowmasker from ncbi-blast-2.2.30+ but the error persists, it’s possible that your system is missing the ZLIB library. Please ensure that this library is installed before using GageTracker.

wget https://zlib.net/fossils/zlib-1.2.9.tar.gz
tar -zxvf zlib-1.2.9.tar.gz
cd zlib-1.2.9
./configure --prefix=$HOME/zlib-1.2.9
make
make install

If you encounter any issues in the future, please feel free to contact us at any time. We sincerely appreciate your feedback.
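Before rerunning, it can help to confirm that the executables are visible on PATH in the environment where GageTracker runs. The following is a minimal sketch using only the Python standard library. The helper name locate_tools is hypothetical, and the tool list is only an example (windowmasker is the program discussed here; blastn is another BLAST+ program).

```python
import shutil


def locate_tools(names):
    """Map each executable name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Example tool list; adjust to whatever the pipeline actually calls.
    for tool, path in locate_tools(["windowmasker", "blastn"]).items():
        status = path if path else "NOT FOUND on PATH"
        print(f"{tool}: {status}")
```

Run it inside the same conda environment and module set used for the job, since PATH can differ between a login shell and a SLURM job.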

DiegoSafian commented 2 months ago

Hi,

I am running it on SLURM, so I was using:

Currently Loaded Modules:
  1) XZ/5.2.7-GCCcore-12.2.0           10) bzip2/1.0.8-GCCcore-12.2.0   19) libjpeg-turbo/2.1.4-GCCcore-12.2.0  28) OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1
  2) libxml2/2.10.3-GCCcore-12.2.0     11) PCRE/8.45-GCCcore-12.2.0     20) LMDB/0.9.29-GCCcore-12.2.0          29) gompi/2016b
  3) libpciaccess/0.17-GCCcore-12.2.0  12) gzip/1.12-GCCcore-12.2.0     21) BLAST+/2.14.0-gompi-2022b           30) FFTW/3.3.4-gompi-2016b
  4) OpenSSL/1.1                       13) lz4/1.9.4-GCCcore-12.2.0     22) GCCcore/5.4.0                       31) ScaLAPACK/2.0.2-gompi-2016b-OpenBLAS-0.2.18-LAPACK-3.6.1
  5) libevent/2.1.12-GCCcore-12.2.0    14) zstd/1.5.2-GCCcore-12.2.0    23) binutils/2.26-GCCcore-5.4.0         32) foss/2016b
  6) UCX/1.13.1-GCCcore-12.2.0         15) ICU/72.1-GCCcore-12.2.0      24) GCC/5.4.0-2.26                      33) zlib/1.2.8-foss-2016b
  7) libfabric/1.16.1-GCCcore-12.2.0   16) Boost/1.81.0-GCC-12.2.0      25) numactl/2.0.11-GCC-5.4.0-2.26       34) libpng/1.2.59-foss-2016b
  8) PMIx/4.2.2-GCCcore-12.2.0         17) GMP/6.2.1-GCCcore-12.2.0     26) hwloc/1.11.3-GCC-5.4.0-2.26         35) Kent_tools/20190117-linux.x86_64
  9) UCC/1.1.0-GCCcore-12.2.0          18) NASM/2.15.05-GCCcore-12.2.0  27) OpenMPI/1.10.3-GCC-5.4.0-2.26

with a conda environment that has all the required packages.

I have now updated the conda environment by running conda install conda-forge::zlib and conda install bioconda::blast, and this issue seems to be solved.

This is the current conda environment ...

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
biopython                 1.79             py38h0a891b7_3    conda-forge
blast                     2.16.0               hc155240_2    bioconda
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.33.1               heb4867d_0    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
curl                      8.9.1                h18eb788_0    conda-forge
entrez-direct             22.4                 he881be0_0    bioconda
freetype                  2.12.1               h267a509_2    conda-forge
gettext                   0.22.5               he02047a_3    conda-forge
gettext-tools             0.22.5               he02047a_3    conda-forge
gtfparse                  1.2.1                    pypi_0    pypi
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
last                      1574                 h43eeafb_0    bioconda
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libasprintf               0.22.5               he8f35ee_3    conda-forge
libasprintf-devel         0.22.5               he8f35ee_3    conda-forge
libblas                   3.9.0           23_linux64_openblas    conda-forge
libcblas                  3.9.0           23_linux64_openblas    conda-forge
libcurl                   8.9.1                hdb1bdb2_0    conda-forge
libdeflate                1.21                 h4bc722e_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    14.1.0               h77fa898_1    conda-forge
libgcc-ng                 14.1.0               h69a702a_1    conda-forge
libgettextpo              0.22.5               he02047a_3    conda-forge
libgettextpo-devel        0.22.5               he02047a_3    conda-forge
libgfortran               14.1.0               h69a702a_1    conda-forge
libgfortran-ng            14.1.0               h69a702a_1    conda-forge
libgfortran5              14.1.0               hc5f4f2c_1    conda-forge
libgomp                   14.1.0               h77fa898_1    conda-forge
libidn2                   2.3.7                hd590300_0    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
liblapack                 3.9.0           23_linux64_openblas    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx                 14.1.0               hc0a3c3a_1    conda-forge
libstdcxx-ng              14.1.0               h4852527_1    conda-forge
libtiff                   4.6.0                h46a8edc_4    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.16                 hb9d3cd8_1    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
maf2synteny               1.2                  hdbdd923_3    bioconda
ncbi-vdb                  3.1.1                h4ac6f70_1    bioconda
ncurses                   6.5                  he02047a_1    conda-forge
numpy                     1.24.4           py38h59b608b_0    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.1                hb9d3cd8_3    conda-forge
pandas                    1.4.3                    pypi_0    pypi
parallel                  20240722             ha770c72_0    conda-forge
perl                      5.32.1          7_hd590300_perl5    conda-forge
perl-archive-tar          2.40            pl5321hdfd78af_0    bioconda
perl-carp                 1.50            pl5321hd8ed1ab_0    conda-forge
perl-common-sense         3.75            pl5321hd8ed1ab_0    conda-forge
perl-compress-raw-bzip2   2.201           pl5321h166bdaf_0    conda-forge
perl-compress-raw-zlib    2.202           pl5321h166bdaf_0    conda-forge
perl-encode               3.21            pl5321hd590300_0    conda-forge
perl-exporter             5.74            pl5321hd8ed1ab_0    conda-forge
perl-exporter-tiny        1.002002        pl5321hd8ed1ab_0    conda-forge
perl-extutils-makemaker   7.70            pl5321hd8ed1ab_0    conda-forge
perl-io-compress          2.201           pl5321hdbdd923_2    bioconda
perl-io-zlib              1.14            pl5321hdfd78af_0    bioconda
perl-json                 4.10            pl5321hdfd78af_1    bioconda
perl-json-xs              4.03            pl5321h4ac6f70_3    bioconda
perl-list-moreutils       0.430           pl5321hdfd78af_0    bioconda
perl-list-moreutils-xs    0.430           pl5321h031d066_2    bioconda
perl-parent               0.241           pl5321hd8ed1ab_0    conda-forge
perl-pathtools            3.75            pl5321h166bdaf_0    conda-forge
perl-scalar-list-utils    1.63            pl5321h166bdaf_0    conda-forge
perl-storable             3.15            pl5321h166bdaf_0    conda-forge
perl-types-serialiser     1.01            pl5321hdfd78af_0    bioconda
pillow                    10.4.0           py38h2bc05a7_0    conda-forge
pip                       24.2               pyh8b19718_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
python                    3.8.19          hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0.post0              pypi_0    pypi
python_abi                3.8                      5_cp38    conda-forge
pytz                      2024.1                   pypi_0    pypi
readline                  8.2                  h8228510_1    conda-forge
rpsbproc                  0.5.0                h6a68c12_0    bioconda
setuptools                72.2.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0                   pypi_0    pypi
tantan                    50                   h43eeafb_0    bioconda
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tzdata                    2024.1                   pypi_0    pypi
wget                      1.21.4               hda4d442_0    conda-forge
wheel                     0.44.0             pyhd8ed1ab_0    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

with modules:

Currently Loaded Modules:
  1) GCCcore/5.4.0                   5) hwloc/1.11.3-GCC-5.4.0-2.26                   9) FFTW/3.3.4-gompi-2016b                                    13) libpng/1.2.59-foss-2016b
  2) binutils/2.26-GCCcore-5.4.0     6) OpenMPI/1.10.3-GCC-5.4.0-2.26                10) ScaLAPACK/2.0.2-gompi-2016b-OpenBLAS-0.2.18-LAPACK-3.6.1  14) Kent_tools/20190117-linux.x86_64
  3) GCC/5.4.0-2.26                  7) OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1  11) foss/2016b
  4) numactl/2.0.11-GCC-5.4.0-2.26   8) gompi/2016b                                  12) zlib/1.2.8-foss-2016b

RiversDong commented 2 months ago

I'm glad to hear that you've resolved this issue.

DiegoSafian commented 2 months ago

Hi again, I got an error in the step "Step IV: transform the maf file to axt files...":

Step IV: Whole genome alignments...
  IV.1 construct the db index using bifurca_hic_LG.fasta.masked.mask
  IV.2 whole genome alignment between reference and focus in 24 processes
Step IV: transform the maf file to axt files...
Verbosity level: 1
foldThreshold: 0.000000    LRfoldThreshold: 2.500000   maxSuspectBases: 2147483647  maxSuspectScore: 100000  minBrokenChainScore: 50000  minLRGapSize: 0
ERROR: target 2bit file or nib directory ./gagetracker/results/2bit/bifurca_hic_LG.fasta.masked.2bit does not exist

Verbosity level: 1
foldThreshold: 0.000000    LRfoldThreshold: 2.500000   maxSuspectBases: 2147483647  maxSuspectScore: 100000  minBrokenChainScore: 50000  minLRGapSize: 0
ERROR: target 2bit file or nib directory./gagetracker/results/2bit/bifurca_hic_LG.fasta.masked.2bit does not exist

However, when I checked ./gagetracker/results/2bit/, I did find the .2bit files, but with different names:

-rw-r--r-- 1 safiand domain users 208M Sep  4 18:24 bifurca_hic_LG.fastaed.2bit
-rw-r--r-- 1 safiand domain users 208M Sep  4 18:28 formosa_LG_correcteded.fasta.2bit
-rw-r--r-- 1 safiand domain users 214M Sep  4 18:27 GCA__UBC_Ppic_1.0_genomic.fna.2bit
-rw-r--r-- 1 safiand domain users 483M Sep  4 18:26 GCF__GRCz11_genomic_danio_rerio.fna.2bit
-rw-r--r-- 1 safiand domain users 185M Sep  4 18:28 prolificasa_LG_corrected.fastaed.2bit
-rw-r--r-- 1 safiand domain users 179M Sep  4 18:28 turneriis_LG_corrected.fastaed.2bit

RiversDong commented 2 months ago

I suspect the issue might be with the naming of the input file. Your input file is named bifurca_hic_LG.fasta.masked, and after masking, the file is named bifurca_hic_LG.fasta.masked.mask.

In line 24 of rbhM.py, .mask is replaced with an empty string. Because .masked also contains .mask, the file name becomes bifurca_hic_LG.fastaed.2bit.

To resolve this error, you can remove .masked from the input file name (and make the corresponding changes in the CTL file as well).

Please let me know if this error still occurs.
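The renaming described above can be reproduced with a short sketch. It assumes (this is not confirmed by the thread) that the script removes the substring .mask with Python's str.replace, which deletes every occurrence rather than only a trailing suffix; under that assumption the result matches the fastaed names in the listing. The function names are hypothetical and are not taken from rbhM.py.

```python
# Hypothetical reconstruction; the actual code in rbhM.py is not shown in this thread.

def strip_mask_naive(name: str) -> str:
    # str.replace removes EVERY occurrence of ".mask",
    # including the one inside ".masked".
    return name.replace(".mask", "")


def strip_mask_suffix(name: str) -> str:
    # Only drop ".mask" when it is the final extension.
    suffix = ".mask"
    return name[: -len(suffix)] if name.endswith(suffix) else name


masked_output = "bifurca_hic_LG.fasta.masked.mask"
print(strip_mask_naive(masked_output) + ".2bit")   # bifurca_hic_LG.fastaed.2bit
print(strip_mask_suffix(masked_output) + ".2bit")  # bifurca_hic_LG.fasta.masked.2bit
```

The second variant yields the name that the error message was looking for. Renaming the input so that it does not contain .masked, as suggested above, avoids the problem without changing any code.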

RiversDong commented 1 month ago

Are you still getting any error messages at this step? If there are no further errors, I will close this issue.

DiegoSafian commented 1 month ago

Hi,

Thanks for asking. I have not finished running the pipeline yet, but the previous issues were solved.

I am now wondering whether there is a way to reduce the storage used by the alignment results. With 11 genomes of between 0.5 and 2.5 G each, the alignment files (in the last and psl folders) take up 3 terabytes. Can I remove one of the two folders while the pipeline is running?

DiegoSafian commented 1 month ago

Hi again, I am getting errors in the chain to chainCleaner steps. It appears that lines in the .chain files for some species are missing information. I am also getting errors about the .2bit files. Please see the attached age_dating.txt for more information.

RiversDong commented 1 month ago

Apologies for the delayed response.

Could you please provide me with your GTF file, the genome of interest, and two reference genome files? Perhaps you could upload the files to Google Drive and share the link with me so that I can access them.

I may need some additional time to investigate thoroughly and determine the cause of the issue from the data you provide.

DiegoSafian commented 1 month ago

Ok. Can I get an email to send you the link?

RiversDong commented 1 month ago

chuand@whu.edu.cn