improve guidance on which technique to use to make the similarity matrix

The current "Advanced usage" section in the README makes it sound like there's almost no use case for passing in a similarity function: only when the matrix doesn't fit on the hard drive. But people have many-terabyte drives now.

I would favor using a similarity function over a CSV file for a large dataset. It would be much faster (and we could be talking about weeks or months of compute time) because: