Open · spleonard1 opened this issue 9 years ago
I ran through the test data set without issue, and am using your scripts on a ~100 GB metagenomic data set. Our cluster uses SLURM job submission, so I'm trying to do a dry run on my desktop (Mac) before adapting the scripts. I haven't had any real issues until the "Calculating the SVD (streaming!)" step, when I got the following error/traceback. I don't have any experience with Pyro4, but it looks like it doesn't know how to "serialize" the numpy array. Any ideas? Let me know if I can provide you any other helpful information.

python LSA/kmer_lsi.py -i ./hashed_reads/ -o ./cluster_vectors/
2015-11-04 00:31:32,643 : INFO : using distributed version with 5 workers
2015-11-04 00:31:32,643 : INFO : updating model with new documents
2015-11-04 00:31:32,643 : INFO : initializing 5 workers
2015-11-04 00:31:33,112 : INFO : preparing a new chunk of documents
Traceback (most recent call last):
  File "LSA/kmer_lsi.py", line 41, in <module>
    lsi = hashobject.train_kmer_lsi(corpus,num_dims=len(hashobject.path_dict)*4/5,single=singleInstance)
  File "/Users/seanleonard/Desktop/LatentStrainAnalysis/LSA/streaming_eigenhashes.py", line 82, in train_kmer_lsi
    return models.LsiModel(kmer_corpus,num_topics=num_dims,id2word=self.path_dict,distributed=True,chunksize=200000)
  File "/Library/Python/2.7/site-packages/gensim/models/lsimodel.py", line 329, in __init__
    self.add_documents(corpus)
  File "/Library/Python/2.7/site-packages/gensim/models/lsimodel.py", line 382, in add_documents
    self.dispatcher.putjob(job)  # put job into queue; this will eventually block, because the queue has a small finite size
  File "/Library/Python/2.7/site-packages/Pyro4/core.py", line 171, in __call__
    return self.__send(self.__name, args, kwargs)
  File "/Library/Python/2.7/site-packages/Pyro4/core.py", line 394, in _pyroInvoke
    compress=Pyro4.config.COMPRESSION)
  File "/Library/Python/2.7/site-packages/Pyro4/util.py", line 167, in serializeCall
    data = self.dumpsCall(obj, method, vargs, kwargs)
  File "/Library/Python/2.7/site-packages/Pyro4/util.py", line 476, in dumpsCall
    return serpent.dumps((obj, method, vargs, kwargs), module_in_classname=True)
  File "/Library/Python/2.7/site-packages/serpent.py", line 78, in dumps
    return Serializer(indent, set_literals, module_in_classname).serialize(obj)
  File "/Library/Python/2.7/site-packages/serpent.py", line 250, in serialize
    self._serialize(obj, out, 0)
  File "/Library/Python/2.7/site-packages/serpent.py", line 271, in _serialize
    return self.dispatch[t](self, obj, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 369, in ser_builtins_tuple
    serialize(elt, out, level + 1)
  File "/Library/Python/2.7/site-packages/serpent.py", line 271, in _serialize
    return self.dispatch[t](self, obj, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 369, in ser_builtins_tuple
    serialize(elt, out, level + 1)
  File "/Library/Python/2.7/site-packages/serpent.py", line 291, in _serialize
    f(self, obj, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 552, in ser_default_class
    self._serialize(value, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 271, in _serialize
    return self.dispatch[t](self, obj, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 431, in ser_builtins_dict
    serialize(v, out, level + 1)
  File "/Library/Python/2.7/site-packages/serpent.py", line 291, in _serialize
    f(self, obj, out, level)
  File "/Library/Python/2.7/site-packages/serpent.py", line 551, in ser_default_class
    raise TypeError("don't know how to serialize class " + str(obj.__class__) + ". Give it vars() or an appropriate getstate")
TypeError: don't know how to serialize class <type 'numpy.ndarray'>. Give it vars() or an appropriate getstate

Thanks!
Hi,
I apologize for the super slow response - somehow missed this.
Are you still having this problem? If you could send me the output of "ls -l" for hashed_reads/ and cluster_vectors/ that would help me to diagnose the problem. Sometimes a failure in an earlier step could lead to this problem.
Also - you might want to try running the test data using the streaming SVD (locally), just to see if you have the correct environment.
Hi, I am having the same issue. The software worked fine with the test data that you provided. Here is the error message:
Here is the worker error log file:
And here is the output of ls -l for hashed_reads/ and cluster_vectors/:
ls_-l_cluster_vectors.txt ls_-l_hashed_reads.txt
Does it have something to do with this? https://pythonhosted.org/Pyro4/tipstricks.html#pyro-and-numpy
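That page says serpent (Pyro4's default serializer) doesn't handle numpy types, and the failure is easy to reproduce outside of LSA entirely. A minimal sketch of the check, assuming only that numpy and the serpent version from the traceback are installed:

import numpy as np
import serpent  # the serializer Pyro4 uses by default

arr = np.zeros(3)
try:
    serpent.dumps(arr)  # fails: serpent has no handler for numpy.ndarray
except TypeError as err:
    print(err)  # "don't know how to serialize class ... Give it vars() or an appropriate getstate"

print(serpent.dumps(arr.tolist()))  # converting to plain Python types first works

The fixes that page suggests are either converting numpy values with .tolist()/.item() before they go over the wire, or switching Pyro4 to the pickle serializer.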
Maybe it's something in my environment, like you said above? I had to edit the KmerLSI.py script for SGE:
Thanks again, Brian!
Hm. So it's still not clear to me if it's the environment.
Can you see if you are able to run the distributed version of the SVD on the test data? The "Getting started" stuff uses the single instance version of the SVD, but you should be able to run the test data up to that point, and then try the SVD with a couple of different workers.
This will help us clarify if you can run that portion of the code at all, or if there is maybe something funky in the data that is fed into the SVD.
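For reference, the distributed mode needs gensim's Pyro4 machinery running before kmer_lsi.py starts: a nameserver, one or more lsi_worker processes, and an lsi_dispatcher. A rough local-only sketch of that launch sequence (the worker count and the sleep interval are arbitrary choices, not part of the pipeline):

import subprocess, time

subprocess.Popen(["python", "-m", "Pyro4.naming"])  # 1. Pyro4 nameserver
time.sleep(5)  # crude: give the nameserver time to come up
for _ in range(2):  # 2. one lsi_worker per core you want to use
    subprocess.Popen(["python", "-m", "gensim.models.lsi_worker"])
subprocess.Popen(["python", "-m", "gensim.models.lsi_dispatcher"])  # 3. coordinates the workers
# 4. then run: python LSA/kmer_lsi.py -i ./hashed_reads/ -o ./cluster_vectors/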
Also, I note that you're setting up your PATH after starting all the Pyro processes... not sure if this is significant (or intentional).
Okay, I'll check those: first, setting up the path before starting Pyro4, then running the test data with a few workers. Thanks, Brian!
Hi, Brian,
I killed two birds with one stone and exported my PATH before running Pyro4 on the test data (the exact same files that the single instance was able to process). Here is the output, which looks pretty much the same. It's still complaining about numpy:
KmerLSI_path1st.err.txt KmerLSI_path1st.out.txt
I am running my large dataset with the single-threaded version of KmerLSI and it is working fine. I'm pretty sure there is something wrong with the multiprocessing. How much longer would the single-instance version take to run (5x?)?
No problem with the multithreading itself; that was a red herring. The trouble is just with Pyro4.
Hi, Brian,
Okay, so I've been able to successfully run a second large dataset (~240 GB) through kmer_lsi.py in single-threaded mode, but it takes days. The Pyro4 problem still persists with this dataset, though.
Are you sure that this isn't the problem: https://pythonhosted.org/Pyro4/tipstricks.html#pyro-and-numpy
Is there any way I can edit kmer_lsi.py to test if this is the problem?
Thanks for your time!
Just wondering if this was ever solved? It seems to run successfully on the test dataset provided with the repo, but it produces the error on my own dataset.
Hi, @sunitj. FWIW, I'm afraid I wasn't able to solve it myself, and I haven't heard from Brian Cleary since December. It's pretty clear to me that the lsi script works fine in single-core mode (that's how it runs on the test dataset, and it also works on a large dataset, it just takes a long weekend), but I cannot get it to work across multiple cores/nodes with the Pyro4 library. Pyro4 complains about the numpy format, but I'm not sure exactly how to fix this in kmer_lsi.py.
I'm seeing the same issue. As @russianconcussion suggested, I'm simply running it single threaded for now.
I've gotten past the serialization error by inserting the following before launching the Pyro4 nameserver:
export PYRO_SERIALIZERS_ACCEPTED=serpent,json,marshal,pickle
export PYRO_SERIALIZER=pickle
I'm now getting this error:
Date: Wed Feb 10 08:47:29 HST 2016
2016-02-10 08:48:03,862 : ERROR : failed to initialize distributed LSI (unknown name: gensim.lsi_dispatcher)
Traceback (most recent call last):
File "LSA/kmer_lsi.py", line 41, in <module>
lsi = hashobject.train_kmer_lsi(corpus,num_dims=len(hashobject.path_dict)*4/5,single=singleInstance)
File "/mnt/lysine/assemblies/CSHLII/eigenomes/LatentStrainAnalysis/LSA/streaming_eigenhashes.py", line 82, in train_kmer_lsi
return models.LsiModel(kmer_corpus,num_topics=num_dims,id2word=self.path_dict,distributed=True,chunksize=200000)
File "/opt/virtualenv/eigengenomes/lib/python2.7/site-packages/gensim/models/lsimodel.py", line 326, in __init__
raise RuntimeError("failed to initialize distributed LSI (%s)" % err)
RuntimeError: failed to initialize distributed LSI (unknown name: gensim.lsi_dispatcher)
Date: Wed Feb 10 08:48:03 HST 2016
This looks like a collision of nameservers and may be unique to my setup. @russianconcussion and @sunitj, let me know if this gets you anywhere.
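For anyone hitting the same "unknown name" error: you can ask whichever nameserver the client actually finds what is registered with it. A quick sketch using Pyro4's standard locator, which honors the PYRO_NS_HOST/PYRO_NS_PORT settings above:

import Pyro4

ns = Pyro4.locateNS()  # finds the nameserver via PYRO_NS_HOST / PYRO_NS_PORT (or broadcast)
print(ns.list())       # distributed LSI needs a 'gensim.lsi_dispatcher' entry here

If the dispatcher registered itself with a different nameserver than the one the client finds, you get exactly this "unknown name" failure.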
I have it working.
I am not using a distributed cluster, just a monolithic multi-core machine with loads of RAM, but this setup does enable multithreading.
I generated the stock bash script with python LSFScripts/create_jobs.py -j KmerLSI -i ./
and added the following lines to the generated script LSFScripts/KmerLSI_Job.q
just before the first python call:
export PYRO_SERIALIZERS_ACCEPTED=serpent,json,marshal,pickle
export PYRO_SERIALIZER=pickle
export PYRO_NS_HOST=localhost
export PYRO_NS_PORT=65431
export PYRO_HOST=localhost
...and executed the script with bash LSFScripts/KmerLSI_Job.q
I was having problems with address collisions on my server (hence the HOST and PORT settings) and with serialization, as @russianconcussion and @sunitj experienced.
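For what it's worth, the reason the serializer switch works: serpent only understands plain Python types, while pickle can handle the numpy arrays inside gensim's job chunks. The environment variables are the right mechanism here because every process involved (the dispatcher, the workers, and kmer_lsi.py itself) has to agree; the in-process equivalent would be roughly:

import Pyro4  # set these before any proxies or daemons are created
Pyro4.config.SERIALIZER = "pickle"               # serializer used for outgoing calls
Pyro4.config.SERIALIZERS_ACCEPTED.add("pickle")  # let the receiving daemon accept it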
I encountered the same issue using @russianconcussion's SGE scripts on an Amazon Web Services StarCluster. @jmeppley's workaround solved my multithreading problem, and for convenience I also used screen and ran the bash command in the background. Thanks all!
Does anyone know if @jmeppley's solution will work on a distributed system? If not, how can it be adapted to do so?
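One untested idea: the settings above pin everything to localhost, so they only work on a single machine. For multiple nodes, the nameserver would have to be started on a host every node can reach, and every process pointed at it. A sketch with a placeholder hostname and port (set before Pyro4 is first imported in each process):

import os

# On the head node, start the nameserver on a reachable interface instead of localhost:
#   python -m Pyro4.naming -n headnode.example.org -p 65431
# Then in every process on every node (lsi_worker, lsi_dispatcher, kmer_lsi.py):
os.environ["PYRO_SERIALIZERS_ACCEPTED"] = "serpent,json,marshal,pickle"
os.environ["PYRO_SERIALIZER"] = "pickle"
os.environ["PYRO_NS_HOST"] = "headnode.example.org"  # placeholder: wherever the nameserver runs
os.environ["PYRO_NS_PORT"] = "65431"

import Pyro4  # Pyro4 reads these environment settings when it is first imported

In practice it may be simpler to keep these as shell exports in each node's job script, as above, just with PYRO_NS_HOST set to the head node instead of localhost.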