pilarcormo closed 5 years ago
Hi Pilar
Thanks for reporting this. It's a strange error and I am not exactly sure why you are getting it, but I will make sure it is resolved so that clust runs successfully for you.
Two questions:
clust dataset_file
Best wishes Basel
Hi Basel,
Thanks so much for getting back to me so quickly.
clust dataset_file -r replicates-file.txt -o output_file -n 101 4
Hi Pilar
I know why you are getting this error :)
Your dataset has a single sample only (one column of TPM values). Clustering is not really applicable in principle to one-sample datasets as a single sample does not suffice to make patterns or profiles of gene expression. This error is also explained here:
https://github.com/BaselAbujamous/clust/issues/14
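A quick way to check this before running clust is to count the sample columns in the header of each dataset file. The snippet below is only a minimal sketch: it assumes a tab-delimited file whose first column holds gene names, and `count_sample_columns` is a hypothetical helper, not part of clust.

```python
import csv
import io


def count_sample_columns(handle, delimiter="\t"):
    """Return the number of header columns after the first (gene name) column."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    return len(header) - 1


# Toy stand-in for a dataset file with a single column of TPM values.
example = io.StringIO("Gene\tTPM\ng1\t5.2\ng2\t0.0\n")
n_samples = count_sample_columns(example)
if n_samples < 2:
    print("Only %d sample column found; clustering needs several samples." % n_samples)
```

For a real file, pass an open file handle instead of the `io.StringIO` object.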
If you have multiple samples or you need further assistance in designing your experiment please don't hesitate to let me know the details :)
All the best Basel
Hi Basel,
Thanks so much. That makes sense. I'll change the structure of my data and try again.
Pilar
Hi again,
I changed my input files so that every file now has between 5 and 12 samples, but I'm getting exactly the same error message. Any idea what else I should change?
Thanks
Pilar
So each file has the names of the genes in the first column, followed by 5 to 12 columns for the 5 to 12 samples, and the first row of the file is a header with the titles of the columns? Then you ran clust as:
clust dataset_file -r replicates-file.txt -o output_file -n 101 4
Correct?
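The layout described above (a header row, gene names in the first column, then one numeric column per sample) can be checked with a short script before running clust. This is a rough sketch under assumptions: the file is taken to be tab-delimited, and `find_format_problems` is a hypothetical helper written for this thread, not a clust function.

```python
import io


def find_format_problems(handle, delimiter="\t"):
    """Return a list of human-readable problems found in one dataset file."""
    problems = []
    width = None  # number of fields in the header row
    for lineno, line in enumerate(handle, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip("\r\n").split(delimiter)
        if width is None:
            # The first non-blank line is treated as the header.
            width = len(fields)
            if width < 3:
                problems.append("header has fewer than two sample columns")
            continue
        if len(fields) != width:
            problems.append(
                "line %d has %d fields, expected %d" % (lineno, len(fields), width)
            )
            continue
        for value in fields[1:]:
            try:
                float(value)
            except ValueError:
                problems.append("line %d has non-numeric value %r" % (lineno, value))
                break
    if width is None:
        problems.append("file is empty")
    return problems


good = io.StringIO("Gene\tS1\tS2\ng1\t1.0\t2.0\n")
bad = io.StringIO("Gene\tS1\tS2\ng1\t1.0\tNA\ng2\t3.0\n")
good_problems = find_format_problems(good)
bad_problems = find_format_problems(bad)
for problem in bad_problems:
    print(problem)
```

An empty list means the file matches the assumed layout; it does not guarantee that clust will accept it.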
So I'm running:
clust dataset_folder -r replicates-file.txt -o output_file -n 101 3 4
dataset_folder is where the 3 files with the samples' TPM values are. I added the -n 3 because I have RNA-seq TPM data.
I can't see why the same error would appear in this case if that folder only has those three dataset files. If you would like to send the data files, or one of them, confidentially to my email basel.abujamous@plants.ox.ac.uk, I can check why this happened. If you are concerned about protecting the confidentiality of the data, you can replace the gene names with any other anonymous labels.
Otherwise, if you post the first few lines from each file here I may be able to detect the cause of the problem.
Again, I would like to help until clust is running successfully for you.
BW Basel
Hi,
I'm using clust for the first time, with TPM values to build my clusters. I'm using Python 2.7.14 and running it on an HPC. I get this error:
| Analysis started at: Tuesday 15 January 2019 (09:52:58) |
| 1. Reading dataset(s) |
| 2. Data pre-processing |
Traceback (most recent call last):
  File "/nbi/Research-Groups/JIC/Diane-Saunders/Anaconda/Installation/bin/clust", line 11, in <module>
    sys.exit(main())
  File "/Anaconda/Installation/lib/python2.7/site-packages/clust/main.py", line 98, in main
    args.cs, args.np, args.optimisation, args.q3s, args.basemethods, args.deterministic)
  File "/Anaconda/Installation/lib/python2.7/site-packages/clust/clustpipeline.py", line 102, in clustpipeline
    filteringtype=filteringtype, filterflat=filflat, params=None, datafiles=datafiles)
  File "/Anaconda/Installation/lib/python2.7/site-packages/clust/scripts/preprocess_data.py", line 630, in preprocess
    Xproc[l] = fixnans(Xproc[l])
  File "/Anaconda/Installation/lib/python2.7/site-packages/clust/scripts/preprocess_data.py", line 70, in fixnans
    sumnans = sum(isnan(Xinloc[i]))
TypeError: 'bool' object is not iterable
Any ideas why this might be?
Thanks
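The last frame of the traceback above calls `sum(isnan(Xinloc[i]))`, and the same `TypeError` can be reproduced in isolation whenever the value passed to `sum` is a single boolean rather than a sequence of them. The sketch below is an illustration only: it uses `math.isnan` as a stand-in (which `isnan` clust actually uses is not shown in the traceback), and `count_nans` is a hypothetical helper rather than clust's `fixnans`.

```python
import math


def count_nans(row):
    """Count NaN entries in a row of values (hypothetical helper)."""
    # The generator yields one flag per value, so sum() has something to iterate.
    return sum(math.isnan(value) for value in row)


row = [1.0, float("nan"), 3.0]
nan_count = count_nans(row)
print(nan_count)

# With a single value instead of a row, the NaN test yields one plain bool,
# and sum() cannot iterate over it.
scalar = 2.5
try:
    sum(math.isnan(scalar))
except TypeError as err:
    message = str(err)
    print(message)
```

This would fit a dataset in which each gene has only one value, but it is a guess from the traceback alone and may not be the only way to trigger the error.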