Open El-Castor opened 3 years ago
Hi!
Something is definitely going wrong. Which version of NucDiff are you using, and did you install it through conda?
Hi Kseniakh,
Thank you for your quick response.
Here are the installed dependencies, as listed by conda. I installed NucDiff and all of its dependencies using conda, as described in your documentation.
# packages in environment at /opt/share/FLOCAD/userspace/cpichot/miniconda3/envs/NuCdiff:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
biopython 1.78 py39h3811e60_1 conda-forge
ca-certificates 2020.12.5 ha878542_0 conda-forge
certifi 2020.12.5 py39hf3d152e_1 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_1 conda-forge
libblas 3.9.0 7_openblas conda-forge
libcblas 3.9.0 7_openblas conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h5dbcf3e_17 conda-forge
libgfortran-ng 9.3.0 he4bcb1c_17 conda-forge
libgfortran5 9.3.0 he4bcb1c_17 conda-forge
libgomp 9.3.0 h5dbcf3e_17 conda-forge
liblapack 3.9.0 7_openblas conda-forge
libopenblas 0.3.12 pthreads_h4812303_1 conda-forge
libstdcxx-ng 9.3.0 h2ae2ef3_17 conda-forge
mummer 3.23 4 bioconda
ncurses 6.2 h58526e2_4 conda-forge
nucdiff 2.0.3 pyh864c0ab_1 bioconda
numpy 1.19.5 py39hdbf815f_1 conda-forge
openssl 1.1.1i h7f98852_0 conda-forge
perl 5.32.0 h36c2ea0_0 conda-forge
perl-threaded 5.26.0 0 bioconda
pip 20.3.3 pyhd8ed1ab_0 conda-forge
python 3.9.1 hffdb5ce_3_cpython conda-forge
python_abi 3.9 1_cp39 conda-forge
readline 8.0 he28a2e2_2 conda-forge
setuptools 49.6.0 py39hf3d152e_3 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
tzdata 2020f he74cb21_0 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
Hi!
Can you list the files that were generated and say which of them are empty?
When everything hangs, does the "top" command report that a python process is running, or is nothing running at that moment?
What is the size of the compared genomes, and how many sequences are in each?
Ksenia
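For the last question, a minimal sketch of how the number of sequences and the total size of a FASTA file could be counted. The function name `fasta_stats` and the demo file are illustrative assumptions; this is not part of NucDiff.

```python
import tempfile
from pathlib import Path


def fasta_stats(path):
    """Return (number_of_sequences, total_bases) for a FASTA file."""
    n_seqs = 0
    total_bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_seqs += 1  # every header line starts a new sequence
            else:
                total_bases += len(line)
    return n_seqs, total_bases


if __name__ == "__main__":
    # Tiny made-up FASTA file, used only to demonstrate the function.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.fasta"
        demo.write_text(">chr1\nACGT\nAC\n>chr2\nGG\n")
        print(fasta_stats(demo))  # (2, 8)
```

Running it once on the reference and once on the query gives the two numbers asked for above.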
Hi!
Here is a listing of the files produced:
-rw-r--r-- 1 cpichot utilisateurs 18869278 janv. 15 11:52 comp_DHL92_vs_CMiso_NuCDiff_1.coord
-rw-r--r-- 1 cpichot utilisateurs 385070870 janv. 15 14:37 comp_DHL92_vs_CMiso_NuCDiff_1.snps
-rw-r--r-- 1 cpichot utilisateurs 153201 janv. 15 16:28 comp_DHL92_vs_CMiso_NuCDiff_2.coord
-rw-r--r-- 1 cpichot utilisateurs 6366388 janv. 15 16:30 comp_DHL92_vs_CMiso_NuCDiff_2.snps
-rw-r--r-- 1 cpichot utilisateurs 21259590 janv. 15 11:51 comp_DHL92_vs_CMiso_NuCDiff.coords
-rw-r--r-- 1 cpichot utilisateurs 49649790 janv. 15 11:45 comp_DHL92_vs_CMiso_NuCDiff.delta
-rw-r--r-- 1 cpichot utilisateurs 15699135 janv. 15 11:51 comp_DHL92_vs_CMiso_NuCDiff.filter
-rw-r--r-- 1 cpichot utilisateurs 0 janv. 15 11:51 comp_DHL92_vs_CMiso_NuCDiff_filtered.snps
As you can see, the _filtered.snps file is empty.
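A listing like the one above can also be checked programmatically. The following is a minimal sketch, assuming all intermediate files sit in a single directory; the helper names `list_output_files` and `empty_files` are illustrative and not part of NucDiff.

```python
import tempfile
from pathlib import Path


def list_output_files(directory):
    """Return (file_name, size_in_bytes) for each regular file, sorted by name."""
    return sorted(
        (p.name, p.stat().st_size)
        for p in Path(directory).iterdir()
        if p.is_file()
    )


def empty_files(directory):
    """Return the names of the files in the directory whose size is 0 bytes."""
    return [name for name, size in list_output_files(directory) if size == 0]


if __name__ == "__main__":
    # Made-up demo directory; point the functions at the real output folder instead.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "demo.delta").write_text("some content\n")
        (Path(tmp) / "demo_filtered.snps").write_text("")
        print(empty_files(tmp))  # ['demo_filtered.snps']
```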
I don't understand what you mean when you ask "does the "top" command report that python command is running or nothing is running at this moment?", but I don't see any error message when I launch NucDiff.
Both genomes are roughly 450 Mb in size. Do you think that I have not allocated enough RAM?
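The "top" question is about whether a python process is still using CPU while the tool appears stuck. A minimal sketch of that check from Python, assuming a Linux system where `ps -eo pid,pcpu,args` is available; the function names and the sample text are illustrative assumptions.

```python
import subprocess


def parse_ps(ps_output, keyword="python"):
    """Return (pid, cpu_percent, command) for ps lines whose command contains keyword."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # the first line is the column header
        parts = line.split(None, 2)  # pid, %cpu, then the full command line
        if len(parts) < 3:
            continue
        pid, cpu, command = parts
        if keyword in command.lower():
            # Some locales print the CPU percentage with a decimal comma.
            matches.append((int(pid), float(cpu.replace(",", ".")), command.strip()))
    return matches


def running_python_processes():
    """Ask ps for all processes and keep those whose command mentions python."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    # Made-up ps output, used only to show what the parser returns.
    sample = "  PID %CPU COMMAND\n  101  0.0 bash\n 2042 99.5 python example_script.py\n"
    print(parse_ps(sample))  # [(2042, 99.5, 'python example_script.py')]
```

A python process with a high CPU percentage suggests the tool is still computing rather than blocked. An empty result suggests nothing is running any more.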
Hi!
It is OK that the file _filtered.snps is empty.
As far as I can see, the tool has found the differences inside the fragments, but not between them. To exclude the most obvious reasons, I need to know the following information:
It would be great if you also could run the tool with some small genomes (or just two almost similar sequences) to check if you get the same problem.
Hi Ksenia,
(base) cpichot@node15:/NetScratch/cpichot/genome_publi_ultimate_analysis/polymorphism_between_CmeloPublishedGenome/out$ wc -l comp_DHL92_vs_CMiso_NuCDiff_1.coord
124858 comp_DHL92_vs_CMiso_NuCDiff_1.coord
(base) cpichot@node15:/NetScratch/cpichot/genome_publi_ultimate_analysis/polymorphism_between_CmeloPublishedGenome/out$ wc -l comp_DHL92_vs_CMiso_NuCDiff_2.coord
1021 comp_DHL92_vs_CMiso_NuCDiff_2.coord
Here is a listing of the results folder:
1433620 janv. 25 20:11 comp_DHL92_vs_CMiso_NuCDiff_test_1.coord
45864485 janv. 25 20:11 comp_DHL92_vs_CMiso_NuCDiff_test_1.snps
39904 janv. 25 20:29 comp_DHL92_vs_CMiso_NuCDiff_test_2.coord
1205715 janv. 25 20:29 comp_DHL92_vs_CMiso_NuCDiff_test_2.snps
1533380 janv. 25 20:11 comp_DHL92_vs_CMiso_NuCDiff_test.coords
34719606 janv. 25 18:40 comp_DHL92_vs_CMiso_NuCDiff_test.delta
957730 janv. 25 20:11 comp_DHL92_vs_CMiso_NuCDiff_test.filter
0 janv. 25 20:11 comp_DHL92_vs_CMiso_NuCDiff_test_filtered.snps
And the results folder:
-rw-r--r-- 1 cpichot utilisateurs 165114 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_query_additional.gff
-rw-r--r-- 1 cpichot utilisateurs 252114 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_query_blocks.gff
-rw-r--r-- 1 cpichot utilisateurs 21371054 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_query_snps.gff
-rw-r--r-- 1 cpichot utilisateurs 1093183 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_query_struct.gff
-rw-r--r-- 1 cpichot utilisateurs 227771 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_ref_additional.gff
-rw-r--r-- 1 cpichot utilisateurs 151191 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_ref_blocks.gff
-rw-r--r-- 1 cpichot utilisateurs 21721331 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_ref_snps.gff
-rw-r--r-- 1 cpichot utilisateurs 1145023 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_ref_struct.gff
-rw-r--r-- 1 cpichot utilisateurs 800 janv. 25 23:15 comp_DHL92_vs_CMiso_NuCDiff_test_stat.out
Hi,
I know that NucDiff may need a long time if there are a lot of structural differences between two specific sequences. That is probably the case here. I would run the tool one more time and wait a bit longer. The alternative is to run the tool for each chromosome separately. It is not optimal, but it will at least speed up the process, narrow down the problem, if there is one, and show which chromosome causes it.
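To run each chromosome separately, the multi-sequence FASTA files first have to be split into one file per sequence. A minimal sketch of that step, assuming standard FASTA input with unique sequence IDs (duplicate IDs would overwrite each other); `split_fasta` and the output naming are illustrative assumptions, not part of NucDiff.

```python
import tempfile
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each sequence of a multi-FASTA file to its own file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if out is not None:
                    out.close()
                # Use the first word of the header as the sequence ID.
                fields = line[1:].split()
                seq_id = fields[0] if fields else f"seq{len(written) + 1}"
                # Replace characters that are awkward in file names.
                safe = "".join(c if c.isalnum() or c in "._-" else "_" for c in seq_id)
                path = out_dir / f"{safe}.fasta"
                out = open(path, "w")
                written.append(path)
            if out is not None:
                out.write(line)  # header and sequence lines go to the current file
    if out is not None:
        out.close()
    return written


if __name__ == "__main__":
    # Made-up two-sequence FASTA file, used only to demonstrate the function.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.fasta"
        demo.write_text(">chr1 example\nACGT\n>chr2\nGG\nTT\n")
        for p in split_fasta(demo, Path(tmp) / "split"):
            print(p.name)
```

Each reference chromosome file can then be compared with the matching query chromosome file in its own NucDiff run, provided the two assemblies use corresponding chromosome names.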
Hi,
I was trying your tool on MUMmer output (a delta file) to get all SNPs and gaps between two genomes.
Here is the command line that I used:
The tool hangs after printing this message and, two days later, is still at the same step. The .snps file is not empty, but the _filtered.snps file is.
Here is the console message:
Do you have any suggestions?
Thanks in advance!