Open chasemc opened 2 weeks ago
To be clear: as written, this would only happen when there are fewer "samples" (contigs) than PCA dimensions.
What would be the point of doing PCA on a dataset with fewer than 50 contigs before some other dimension reduction technique? I think some data should be gathered on whether this is useful or makes a difference before the change is made.
The main reason is so that a minimal dataset, one small enough to run quickly, doesn't fail when testing the workflows.
Would it be okay to switch:
https://github.com/KwanLab/Autometa/blob/0d9028cf7bad20d6e28667aaba9d3889a15ace09/autometa/common/kmers.py#L601-L607
to adapt to a lower PCA dimension when there aren't enough contigs/kmers?
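
A rough sketch of the kind of check being proposed, for discussion only. It does not reproduce the code at the linked lines: the function name `clamp_pca_dimensions` and its parameters are hypothetical, and the default of 50 is taken from the figure mentioned above. It assumes the PCA step needs the number of components to be no larger than the smaller dimension of the contigs-by-kmers matrix; some solvers may require it to be strictly smaller.

```python
def clamp_pca_dimensions(n_contigs: int, n_kmers: int, pca_dimensions: int = 50) -> int:
    """Return a PCA component count that fits the shape of the k-mer matrix.

    Hypothetical helper: names and the default of 50 are assumptions made
    for illustration, not taken from the Autometa source.
    """
    # The largest component count the matrix can support (assumed constraint:
    # components <= min(rows, columns)).
    max_components = min(n_contigs, n_kmers)
    if max_components < 1:
        raise ValueError("Need at least one contig and one k-mer to run PCA.")
    # Keep the requested value when the data allow it, otherwise lower it.
    return min(pca_dimensions, max_components)


# A tiny test dataset with 10 contigs and 256 k-mer columns.
n_components = clamp_pca_dimensions(n_contigs=10, n_kmers=256)
```

With a check like this, a normal-sized dataset would keep the requested dimension unchanged, and only the small test datasets would be affected.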