Suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments. See full GimmeMotifs documentation for detailed installation instructions and usage examples.
Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa can be downloaded from here.
Expected behavior
I expected the program to run in full or to provide a concise error message.
Error logs
2024-03-17 11:07:42,884 - INFO - Starting maelstrom
2024-03-17 11:07:42,890 - INFO - motif scanning (counts)
2024-03-17 11:07:42,890 - INFO - reading table
2024-03-17 11:07:45,717 - INFO - using 14000 sequences
2024-03-17 11:08:34,427 - INFO - setting threshold
Determining FPR-based threshold: 100%|██████████████████████████████████████████████████████████████| 10633/10633 [12:40<00:00, 13.98 sequences/s]
2024-03-17 11:21:23,647 - INFO - creating count table
Scanning: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:09<00:00, 9.37s/ sequences]
2024-03-17 11:21:33,022 - INFO - done
2024-03-17 11:21:33,022 - INFO - creating dataframe
2024-03-17 11:21:33,435 - INFO - motif scanning (scores)
2024-03-17 11:21:33,435 - INFO - reading table
2024-03-17 11:21:39,620 - INFO - using 14000 sequences
2024-03-17 11:22:13,126 - INFO - creating score table (z-score, GC%)
Determining mean and stddev for motifs: 100%|██████████████████████████████████████████████████████████| 19756/19756 [11:18<00:00, 29.13 motifs/s]
Scanning: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.55s/ sequences]
2024-03-17 11:33:43,722 - INFO - done
2024-03-17 11:33:43,722 - INFO - creating dataframe
2024-03-17 11:33:44,235 - INFO - Selecting non-redundant motifs
Traceback (most recent call last):
  File "/home/mabe/.conda/envs/mabe/bin/gimme", line 12, in <module>
    cli(sys.argv[1:])
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/gimmemotifs/cli.py", line 755, in cli
    args.func(args)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/gimmemotifs/commands/maelstrom.py", line 42, in maelstrom
    run_maelstrom(
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/gimmemotifs/maelstrom/__init__.py", line 239, in run_maelstrom
    fa.fit(scores)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/sklearn/base.py", line 1474, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/sklearn/cluster/_agglomerative.py", line 1329, in fit
    super()._fit(X.T)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/sklearn/cluster/_agglomerative.py", line 1066, in _fit
    out = memory.cache(tree_builder)(
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/joblib/memory.py", line 353, in __call__
    return self.func(*args, **kwargs)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/sklearn/cluster/_agglomerative.py", line 706, in _complete_linkage
    return linkage_tree(*args, **kwargs)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/sklearn/cluster/_agglomerative.py", line 585, in linkage_tree
    out = hierarchy.linkage(X, method=linkage, metric=affinity)
  File "/home/mabe/.conda/envs/mabe/lib/python3.10/site-packages/scipy/cluster/hierarchy.py", line 1030, in linkage
    raise ValueError("The condensed distance matrix must contain only "
ValueError: The condensed distance matrix must contain only finite values.
Installation information (please complete the following information):
OS: [Ubuntu 22.04.4 LTS]
Installation [conda]
Version [0.18.0]
Additional context
As I am new to the software, this might well be an error on my side (or maybe the statistics just don't work out on a single gene). Still, I think that the error handling/error message should be better!
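To illustrate the single-gene hypothesis, here is a minimal sketch that produces the same ValueError with SciPy alone. It assumes the motifs are clustered with a correlation-type distance, which I have not verified in the GimmeMotifs source; the traceback only shows complete linkage with some `metric`. The score values and the 1 x 5 table shape are made up for illustration. With a single sequence, every motif is a vector of length one, so its variance is zero and the correlation distance between any two motifs is undefined (NaN).

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist

# Hypothetical score table: 1 sequence (row) x 5 motifs (columns).
# The values are invented; only the shape matters here.
scores = np.array([[0.3, -1.2, 0.8, 2.1, -0.5]])

# Clustering motifs means clustering columns, so the table is transposed:
# 5 observations, each a vector of length 1.
# Assumption: a correlation distance is used (not confirmed for GimmeMotifs).
dist = pdist(scores.T, metric="correlation")

# Each length-1 vector has zero variance, so every pairwise distance is NaN.
all_nan = bool(np.isnan(dist).all())

# Complete linkage on a condensed distance matrix with NaNs is rejected.
try:
    hierarchy.linkage(dist, method="complete")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print("all distances NaN:", all_nan)
print("linkage error:", error_message)
```

If this is indeed what happens, a check on the number of input sequences before the clustering step would allow a much clearer message than the SciPy traceback.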
Describe the bug
In order to perform differential enrichment, I wanted to try on a single gene, but gimme crashes.
To Reproduce
Steps to reproduce the behavior:
IFIH1.fasta is attached: IFIH1.fasta.txt