ajinkyakhilari opened this issue 3 years ago
Hello, I have exactly the same error. Did you resolve it? Regards, Benjamin Penaud
Hey, I tried the test run with conda and my run crashed at the same spot.
This is the error from the specific working directory:
UMAP(verbose=2)
Construct fuzzy simplicial set
Tue Feb 16 16:16:30 2021 Finding Nearest Neighbors
Tue Feb 16 16:16:33 2021 Finished Nearest Neighbor Search
Tue Feb 16 16:16:35 2021 Construct embedding
completed 0 / 500 epochs
completed 50 / 500 epochs
completed 100 / 500 epochs
completed 150 / 500 epochs
completed 200 / 500 epochs
completed 250 / 500 epochs
completed 300 / 500 epochs
completed 350 / 500 epochs
completed 400 / 500 epochs
completed 450 / 500 epochs
Tue Feb 16 16:16:42 2021 Finished embedding
Traceback (most recent call last):
File "/cluster/work/users/thhaverk/nanoclust_tmp/fe/f4dc7167db2f6187bd0d5bf4ecc692/.command.sh", line 26, in <module>
plt.figure(figsize=(20,20))
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-800e1e27475cbaa0538f834c4aacc420/lib/python3.8/site-packages/matplotlib/pyplot.py", line 671, in figure
figManager = new_figure_manager(num, figsize=figsize,
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-800e1e27475cbaa0538f834c4aacc420/lib/python3.8/site-packages/matplotlib/pyplot.py", line 299, in new_figure_manager
return _backend_mod.new_figure_manager(*args, **kwargs)
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-800e1e27475cbaa0538f834c4aacc420/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 3494, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-800e1e27475cbaa0538f834c4aacc420/lib/python3.8/site-packages/matplotlib/backends/_backend_tk.py", line 868, in new_figure_manager_given_figure
window = tk.Tk(className="matplotlib")
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-800e1e27475cbaa0538f834c4aacc420/lib/python3.8/tkinter/__init__.py", line 2261, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display "158.36.42.36:25.0"
Any idea how to solve it?
Okay, that did not work for me. Can you explain why you found that package was needed?
When I check the error, I see this:
Command error:
Traceback (most recent call last):
File ".command.sh", line 26, in <module>
plt.figure(figsize=(20,20))
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-165c04fe82a861f4b9dc6382a66f5ed7/lib/python3.8/site-packages/matplotlib/pyplot.py", line 671, in figure
figManager = new_figure_manager(num, figsize=figsize,
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-165c04fe82a861f4b9dc6382a66f5ed7/lib/python3.8/site-packages/matplotlib/pyplot.py", line 299, in new_figure_manager
return _backend_mod.new_figure_manager(*args, **kwargs)
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-165c04fe82a861f4b9dc6382a66f5ed7/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 3494, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-165c04fe82a861f4b9dc6382a66f5ed7/lib/python3.8/site-packages/matplotlib/backends/_backend_tk.py", line 868, in new_figure_manager_given_figure
window = tk.Tk(className="matplotlib")
File "/cluster/work/users/thhaverk/nanoclust_tmp/conda/read_clustering-165c04fe82a861f4b9dc6382a66f5ed7/lib/python3.8/tkinter/__init__.py", line 2261, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display "158.36.42.36:25.0"
especially the last line, which refers to the IP address of a display. Why is a display needed? I am working on an HPC cluster, so there is no display around other than my terminal.
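For what it's worth, the `TclError` happens because matplotlib defaults to the interactive TkAgg backend, which needs an X display to open a window. On a headless node, one workaround is to force the non-interactive Agg backend before pyplot is first imported. A minimal sketch (the figure size mirrors the call that crashed in `.command.sh`; nothing here is NanoCLUST-specific):

```python
# Sketch: select the non-interactive Agg backend so no X display is needed.
# This must run before matplotlib.pyplot is imported for the first time.
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 20))       # the same call that raised TclError
plt.savefig("hdbscan.output.png")  # writes the file without opening a window
```

Alternatively, exporting the environment variable `MPLBACKEND=Agg` before launching the pipeline has the same effect without editing any script.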
I will check the docker option.
Okay, I solved my issue by modifying the nextflow.config file to use Singularity instead of Docker. I added a singularity profile to the profiles section (see below). I work on an HPC cluster where we are not allowed to use Docker, but I can run Docker images through Singularity.
This is my modified nextflow.config file:
profiles {
    test { includeConfig 'conf/test.config' }
    conda {
        process {
            withName: demultiplex { conda = "$baseDir/conda_envs/demultiplex/environment.yml" }
            withName: demultiplex_porechop { conda = "$baseDir/conda_envs/demultiplex_porechop/environment.yml" }
            withName: QC { conda = "$baseDir/conda_envs/qc_fastp/environment.yml" }
            withName: fastqc { conda = "$baseDir/conda_envs/fastqc/environment.yml" }
            withName: multiqc { conda = "$baseDir/conda_envs/fastqc/environment.yml" }
            withName: kmer_freqs { conda = "$baseDir/conda_envs/kmer_freqs/environment.yml" }
            withName: read_clustering { conda = "$baseDir/conda_envs/read_clustering/environment.yml" }
            withName: split_by_cluster { conda = "$baseDir/conda_envs/split_by_cluster/environment.yml" }
            withName: read_correction { conda = "$baseDir/conda_envs/read_correction/environment.yml" }
            withName: draft_selection { conda = "$baseDir/conda_envs/draft_selection/environment.yml" }
            withName: racon_pass { conda = "$baseDir/conda_envs/racon_pass/environment.yml" }
            withName: medaka_pass { conda = "$baseDir/conda_envs/medaka_pass/environment.yml" }
            withName: consensus_classification { conda = "$baseDir/conda_envs/consensus_classification/environment.yml" }
            withName: get_abundances { conda = "$baseDir/conda_envs/cluster_plot_pool/environment.yml" }
            withName: plot_abundances { conda = "$baseDir/conda_envs/cluster_plot_pool/environment.yml" }
            withName: output_documentation { conda = "$baseDir/conda_envs/output_documentation/environment.yml" }
        }
    }
    docker {
        docker.enabled = true
        //process.container = 'nf-core/nanoclust:latest'
        process {
            withName: demultiplex { container = 'hecrp/nanoclust-demultiplex' }
            withName: demultiplex_porechop { container = 'hecrp/nanoclust-demultiplex_porechop' }
            withName: QC { container = 'hecrp/nanoclust-qc' }
            withName: fastqc { container = 'hecrp/nanoclust-fastqc' }
            withName: multiqc { container = 'hecrp/nanoclust-fastqc' }
            withName: kmer_freqs { container = 'hecrp/nanoclust-kmer_freqs' }
            withName: read_clustering { container = 'hecrp/nanoclust-read_clustering' }
            withName: split_by_cluster { container = 'hecrp/nanoclust-split_by_cluster' }
            withName: read_correction { container = 'hecrp/nanoclust-read_correction' }
            withName: draft_selection { container = 'hecrp/nanoclust-draft_selection' }
            withName: racon_pass { container = 'hecrp/nanoclust-racon_pass' }
            withName: medaka_pass { container = 'hecrp/nanoclust-medaka_pass' }
            withName: consensus_classification {
                container = 'hecrp/nanoclust-consensus_classification'
                docker.temp = "$baseDir/"
            }
            withName: get_abundances { container = 'hecrp/nanoclust-plot_abundances' }
            withName: plot_abundances { container = 'hecrp/nanoclust-plot_abundances' }
            withName: output_documentation { container = 'hecrp/nanoclust-output_documentation' }
        }
    }
    singularity {
        singularity.enabled = true
        singularity.autoMounts = true
        //process.container = 'nf-core/nanoclust:latest'
        process {
            withName: demultiplex { container = 'docker://hecrp/nanoclust-demultiplex' }
            withName: demultiplex_porechop { container = 'docker://hecrp/nanoclust-demultiplex_porechop' }
            withName: QC { container = 'docker://hecrp/nanoclust-qc' }
            withName: fastqc { container = 'docker://hecrp/nanoclust-fastqc' }
            withName: multiqc { container = 'docker://hecrp/nanoclust-fastqc' }
            withName: kmer_freqs { container = 'docker://hecrp/nanoclust-kmer_freqs' }
            withName: read_clustering { container = 'docker://hecrp/nanoclust-read_clustering' }
            withName: split_by_cluster { container = 'docker://hecrp/nanoclust-split_by_cluster' }
            withName: read_correction { container = 'docker://hecrp/nanoclust-read_correction' }
            withName: draft_selection { container = 'docker://hecrp/nanoclust-draft_selection' }
            withName: racon_pass { container = 'docker://hecrp/nanoclust-racon_pass' }
            withName: medaka_pass { container = 'docker://hecrp/nanoclust-medaka_pass' }
            withName: consensus_classification {
                container = 'docker://hecrp/nanoclust-consensus_classification'
                singularity.temp = "$baseDir/"
            }
            withName: get_abundances { container = 'docker://hecrp/nanoclust-plot_abundances' }
            withName: plot_abundances { container = 'docker://hecrp/nanoclust-plot_abundances' }
            withName: output_documentation { container = 'docker://hecrp/nanoclust-output_documentation' }
        }
    }
}
I had the same issue under a conda environment. In my case, it seemed to stem from a version discrepancy between the ./conda_envs/read_clustering/environment.yml file and the repository. I updated hdbscan and umap-learn to the newest versions found with conda search <package>, and it is working fine now.
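For reference, here is a hypothetical sketch of what a relaxed conda_envs/read_clustering/environment.yml could look like after dropping the pins; the actual package list and channels in the NanoCLUST repository may differ:

```yaml
# Hypothetical environment.yml with version pins removed, as described above.
# The exact package names and channels in the repository may differ.
name: read_clustering
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.8
  - umap-learn
  - hdbscan
  - matplotlib
  - pandas
  - scikit-learn
```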
@hoohugokim Thank you for your comment. In my case, removing all the pinned package versions worked. I finally got through that step.
executor > local (5)
[d8/6fd53d] process > QC (1)                   [100%] 1 of 1 ✔
[80/44cf34] process > fastqc (1)               [100%] 1 of 1 ✔
[0c/7b75e3] process > kmer_freqs (1)           [100%] 1 of 1 ✔
[cb/16e8b0] process > read_clustering (1)      [100%] 1 of 1, failed: 1 ✘
[-        ] process > split_by_cluster         -
[-        ] process > read_correction          -
[-        ] process > draft_selection          -
[-        ] process > racon_pass               -
[-        ] process > medaka_pass              -
[-        ] process > consensus_classification -
[-        ] process > join_results             -
[-        ] process > get_abundances           -
[-        ] process > plot_abundances          -
[12/0efcc7] process > output_documentation     [100%] 1 of 1 ✔

Error executing process > 'read_clustering (1)'

Caused by:
  Process `read_clustering (1)` terminated with an error exit status (1)

Command executed [/NanoporeTools/NanoCLUST/templates/umap_hdbscan.py]:
#!/usr/bin/env python

import numpy as np
import umap
import matplotlib.pyplot as plt
from sklearn import decomposition
import random
import pandas as pd
import hdbscan

df = pd.read_csv("freqs.txt", delimiter="\t")

# UMAP
motifs = [x for x in df.columns.values if x not in ["read", "length"]]
X = df.loc[:,motifs]
X_embedded = umap.UMAP(n_neighbors=15, min_dist=0.1, verbose=2).fit_transform(X)

df_umap = pd.DataFrame(X_embedded, columns=["D1", "D2"])
umap_out = pd.concat([df["read"], df["length"], df_umap], axis=1)

# HDBSCAN
X = umap_out.loc[:,["D1", "D2"]]
umap_out["bin_id"] = hdbscan.HDBSCAN(min_cluster_size=int(200), cluster_selection_epsilon=int(0.5)).fit_predict(X)

# PLOT
plt.figure(figsize=(20,20))
plt.scatter(X_embedded[:, 0], X_embedded[:, 1], c=umap_out["bin_id"], cmap='Spectral', s=1)
plt.xlabel("UMAP1", fontsize=18)
plt.ylabel("UMAP2", fontsize=18)
plt.gca().set_aspect('equal', 'datalim')
plt.title("Projecting " + str(len(umap_out['bin_id'])) + " reads. " + str(len(umap_out['bin_id'].unique())) + " clusters generated by HDBSCAN", fontsize=18)

for cluster in np.sort(umap_out['bin_id'].unique()):
    read = umap_out.loc[umap_out['bin_id'] == cluster].iloc[0]
    plt.annotate(str(cluster), (read['D1'], read['D2']), weight='bold', size=14)

plt.savefig('hdbscan.output.png')
umap_out.to_csv("hdbscan.output.tsv", sep=" ", index=False)
Command exit status:
  1

Command output:
  UMAP(verbose=2)
  Construct fuzzy simplicial set
  Fri Jan 29 12:06:46 2021 Finding Nearest Neighbors
  Fri Jan 29 12:06:46 2021 Building RP forest with 21 trees
  Fri Jan 29 12:06:49 2021 NN descent for 17 iterations
  1 / 17
  2 / 17
  3 / 17
  4 / 17
  5 / 17
  6 / 17
  7 / 17
  8 / 17
  Stopping threshold met -- exiting after 8 iterations
  Fri Jan 29 12:07:08 2021 Finished Nearest Neighbor Search
  Fri Jan 29 12:07:10 2021 Construct embedding
  completed 0 / 200 epochs
  completed 20 / 200 epochs
  completed 40 / 200 epochs
  completed 60 / 200 epochs
  completed 80 / 200 epochs
  completed 100 / 200 epochs
  completed 120 / 200 epochs
  completed 140 / 200 epochs
  completed 160 / 200 epochs
  completed 180 / 200 epochs
  Fri Jan 29 12:08:08 2021 Finished embedding

Command error:
  Traceback (most recent call last):
    File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/joblib/parallel.py", line 820, in dispatch_one_batch
      tasks = self._ready_batches.get(block=False)
    File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/queue.py", line 167, in get
      raise Empty
  _queue.Empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File ".command.sh", line 23, in <module>
    umap_out["bin_id"] = hdbscan.HDBSCAN(min_cluster_size=int(200), cluster_selection_epsilon=int(0.5)).fit_predict(X)
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/hdbscan/hdbscan.py", line 941, in fit_predict
    self.fit(X)
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/hdbscan/hdbscan.py", line 919, in fit
    self._min_spanning_tree) = hdbscan(X, **kwargs)
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/hdbscan/hdbscan.py", line 610, in hdbscan
    (single_linkage_tree, result_min_span_tree) = memory.cache(
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/joblib/memory.py", line 352, in __call__
    return self.func(*args, **kwargs)
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/hdbscan/hdbscan.py", line 275, in _hdbscan_boruvka_kdtree
    alg = KDTreeBoruvkaAlgorithm(tree, min_samples, metric=metric,
  File "hdbscan/_hdbscan_boruvka.pyx", line 375, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm.__init__
  File "hdbscan/_hdbscan_boruvka.pyx", line 411, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm._compute_bounds
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/conda/read_clustering-998d6264058a39a660addfff9962d1f9/lib/python3.8/site-packages/joblib/parallel.py", line 831, in dispatch_one_batch
    islice = list(itertools.islice(iterator, big_batch_size))
  File "hdbscan/_hdbscan_boruvka.pyx", line 412, in <genexpr>
TypeError: delayed() got an unexpected keyword argument 'check_pickle'
Work dir: /home/administrator/Desktop/Bovine_Mastitis_Project/Mastitis_nanopore_data/Project_1/Project1/Project1/20190612_1214_MN26935_FAK72557_229de4aa/work/cb/16e8b0a8d65a824ccac0a1378149f9
Tip: you can replicate the issue by changing to the process work dir and entering the command
bash .command.run
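Regarding the TypeError above: joblib 1.0 removed the `check_pickle` argument from `delayed()`, while older hdbscan releases still pass it, which is consistent with the version-discrepancy fix mentioned earlier (upgrade hdbscan, or keep joblib below 1.0 in the read_clustering environment). A small diagnostic sketch that probes the installed joblib for the removed argument:

```python
# Diagnostic sketch: joblib >= 1.0 dropped the check_pickle keyword that
# older hdbscan versions still pass to joblib.delayed(), producing the
# TypeError seen in the traceback above.
import inspect
import joblib

has_check_pickle = "check_pickle" in inspect.signature(joblib.delayed).parameters
if has_check_pickle:
    print("joblib still accepts check_pickle; an older hdbscan should work")
else:
    print("joblib has dropped check_pickle; upgrade hdbscan or pin joblib<1.0")
```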