BaselAbujamous / clust

Automatic and optimised consensus clustering of one or more heterogeneous datasets

Add initial eigengene computation support #20

Closed apcamargo closed 4 years ago

apcamargo commented 5 years ago

Hi @BaselAbujamous

This PR adds initial support for eigengene computation. Does everything look alright?

It still doesn't support multiple datasets/species. How do you think this should be implemented?

taylorreiter commented 5 years ago

When I tried running clust from the branch in this PR, I got the following error message:

/Users/tr/miniconda3/envs/cluster/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/semlock.py:217: RuntimeWarning: semaphore are broken on OSX, release might increase its maximal value

apcamargo commented 5 years ago

Hi @taylorreiter

From what I've read, this warning is inherent to macOS (although it doesn't seem to be reported under Python 3). Since I worked on this PR when Clust was still written in Python 2, it's expected that you get this warning.

Did you get the output file (despite the warning)?

taylorreiter commented 5 years ago

I got the typical clust outputs, but not the eigengene calculations:

.
├── Clusters_Objects.tsv
├── Clusters_profiles.pdf
├── Input_files_and_params
│   ├── Data
│   │   └── FAL-all-counts-filt.csv
│   ├── Replicates.txt
│   └── input_params.tsv
├── Normalisation_actual.txt
├── Processed_Data
│   └── FAL-all-counts-filt.csv_processed.tsv
├── Summary.tsv
├── log.txt
└── tmp.txt

3 directories, 10 files

apcamargo commented 5 years ago

I think I can take a look at this later. I'll merge the master branch into this branch to make it compatible with Python 3.

In the meantime, have you tried to execute Clust without multithreading (-np 1)?

apcamargo commented 5 years ago

@taylorreiter I've merged master into this branch, and I was able to run Clust with Python 3 and get the Eigengenes.tsv file. Try pulling my latest changes and running Clust with python3 clust.py. I hope it works!

If you still get the warning and an Eigengenes.tsv file isn't generated, I don't think the problem is in my branch. Maybe it's something macOS-specific. In that case, I think the best thing to do is to report the issue to @BaselAbujamous.

apcamargo commented 5 years ago

Hey @taylorreiter! Did you try using this branch again? I think I'll have some time to work on it this week, in case you still have any problems with it.

taylorreiter commented 5 years ago

Hi @apcamargo! I ended up implementing the eigengene calculation as a post-processing step on clust's output, instead of having clust perform the computation during the run.
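
For anyone who wants to take the same post-processing route, here is a minimal sketch of one way it could look. It is not the code from this PR, nor necessarily what was used above. It assumes the common definition of an eigengene as the first principal component of a cluster's gene-standardised expression matrix, and it assumes the cluster membership and processed expression values have already been loaded into plain Python objects. The function names `eigengene` and `cluster_eigengenes` are hypothetical, and parsing of Clusters_Objects.tsv and the Processed_Data files is deliberately left out.

```python
import numpy as np


def eigengene(expr):
    """Return the first principal component of one cluster.

    expr: 2-D array-like, rows = genes, columns = samples.
    Returns a unit-length 1-D array with one value per sample.
    """
    expr = np.asarray(expr, dtype=float)
    # Standardise each gene across samples (zero mean, unit variance).
    centered = expr - expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes contribute a row of zeros
    z = centered / std
    # The rows of vt are the right singular vectors (sample space);
    # the first one is the direction of largest variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    eg = vt[0]
    # The sign of a singular vector is arbitrary, so orient it to
    # follow the cluster's mean standardised profile.
    if np.dot(eg, z.mean(axis=0)) < 0:
        eg = -eg
    return eg


def cluster_eigengenes(expr_by_gene, clusters):
    """Compute one eigengene per cluster.

    expr_by_gene: dict mapping gene ID -> list of values per sample.
    clusters: dict mapping cluster name -> list of gene IDs.
    """
    return {
        name: eigengene([expr_by_gene[g] for g in genes])
        for name, genes in clusters.items()
    }
```

Calling `cluster_eigengenes(expr_by_gene, clusters)` returns a dict with one sample-length vector per cluster, which could then be written out as a table. Because the vector is unit-length and its sign is a convention, it is best read as a relative profile across samples rather than as expression values.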