KrishnaswamyLab / SAUCIE

error during batch correction #10

Closed fedeago closed 5 years ago

fedeago commented 5 years ago

Good morning, I am trying to run a batch correction. The input directory contains 2 CSV files in which the first column holds the row index and the first row holds the column index. The column index, which represents the genes, is the same for both datasets.

python SAUCIE.py --input_dir input --output_dir output --batch_correct

Training batch correction models.
Traceback (most recent call last):
  File "SAUCIE.py", line 338, in <module>
    train_batch_correction(rawfiles)
  File "SAUCIE.py", line 128, in train_batch_correction
    raise(ex)
  File "SAUCIE.py", line 103, in train_batch_correction
    refx = get_data(ref)
  File "SAUCIE.py", line 74, in get_data
    newvals = asinh(x)
  File "C:\Users\Federico\Desktop\Tesi\Elaborazione\Clustering SAUCIE\utils.py", line 8, in asinh
    return f(x)
  File "C:\Users\Federico\Anaconda3\envs\Saucie\lib\site-packages\numpy\lib\function_base.py", line 2739, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\Users\Federico\Anaconda3\envs\Saucie\lib\site-packages\numpy\lib\function_base.py", line 2809, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "C:\Users\Federico\Anaconda3\envs\Saucie\lib\site-packages\numpy\lib\function_base.py", line 2769, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "C:\Users\Federico\Desktop\Tesi\Elaborazione\Clustering SAUCIE\utils.py", line 7, in <lambda>
    f = np.vectorize(lambda y: math.asinh(y / scale))
TypeError: unsupported operand type(s) for /: 'str' and 'float'

Do you know how I can fix it?

If I try to do batch correction in a Jupyter notebook, using a dataset made from the union of the previous two and a second NumPy array that indicates the batches, I receive this other error:

saucie.train(load, steps=100,batch_size=256)

ValueError                                Traceback (most recent call last)
in <module>
----> 1 saucie.train(load, steps=10)

~\Desktop\Tesi\Elaborazione\Clustering SAUCIE\model.py in train(self, load, steps, batch_size)
    411             ops = [obn('train_op')]
    412
--> 413             self.sess.run(ops, feed_dict=feed)
    414
    415     def get_loss(self, load, batch_size=256):

c:\users\federico\anaconda3\envs\saucie\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

c:\users\federico\anaconda3\envs\saucie\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1098             'Cannot feed value of shape %r for Tensor %r, '
   1099             'which has shape %r'
-> 1100             % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
   1101       if not self.graph.is_feedable(subfeed_t):
   1102         raise ValueError('Tensor %s may not be fed.' % subfeed_t)

ValueError: Cannot feed value of shape (256, 1) for Tensor 'batches:0', which has shape '(?,)'

Do you know how I can fix it?

mattamodio commented 5 years ago

The first error is likely due to the presence of the column/row indexes, as it is trying to use the gene names as values.
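
As a rough illustration of that diagnosis, here is a minimal sketch of reading a labelled CSV so that the row and column labels stay out of the numeric values. It assumes pandas is available; the toy CSV text and the `scale` value are made up for the example, and how SAUCIE.py itself reads its input files is not shown in this thread.

```python
import io

import numpy as np
import pandas as pd

# Toy stand-in for one input file (hypothetical data):
# first column = row index, first row = gene names.
csv_text = "cell,geneA,geneB\nc1,0.0,5.0\nc2,10.0,2.5\n"

# index_col=0 and header=0 tell pandas to treat the labels as labels,
# so only the measurements end up in the value matrix.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, header=0)

# Converting to float fails loudly if any remaining cell is a string.
x = df.values.astype(float)

scale = 5.0  # hypothetical value, not taken from utils.py
transformed = np.arcsinh(x / scale)
print(transformed.shape)
```

If the script turns out to expect files without any labels, the same DataFrame could be written back out with `df.to_csv(path, index=False, header=False)`, where `path` is whatever output file you choose.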

The second error probably stems from feeding the batch labels as a 2-dimensional array of shape (N, 1) instead of a 1-dimensional vector of shape (N,), which is what the 'batches:0' tensor expects. You can fix this by reshaping the batch label array prior to feeding it in.
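
A minimal sketch of that reshape with NumPy follows. The label array here is invented for the example (two batches of 150 and 106 cells), and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical batch labels stored as a column, shape (256, 1),
# which is the shape the error message complains about.
batches = np.concatenate([np.zeros(150), np.ones(106)]).reshape(-1, 1)

# Flatten to shape (256,) so it matches a placeholder of shape (?,).
batches_1d = batches.reshape(-1)

print(batches.shape, batches_1d.shape)
```

The flattened array is what should be passed wherever the batch labels are handed to the loader.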