Speed up String ValueStore batch get in Python

Issue #, if available: N/A

Description of changes: Accelerate String ValueStore batch get.

For a string store with 66M rows and 10 columns, the time cost of a sub-matrix batch get is reduced by 60%.

Also, for the Float32 ValueStore, return a NumPy view instead of a Python memoryview so that the result can be connected to a Torch tensor.
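
For reviewers, here is a minimal sketch of why a NumPy view is more convenient than a memoryview here. It does not use the PECOS API: the `raw` buffer is a hypothetical stand-in for the float32 memory a ValueStore might expose, and the 2x3 shape is arbitrary. The Torch step is optional and only runs if `torch` is installed.

```python
import numpy as np

# Hypothetical stand-in for a float32 buffer owned by a value store.
raw = bytearray(np.arange(6, dtype=np.float32).tobytes())

# A memoryview exposes the buffer, but offers no array operations.
mv = memoryview(raw).cast("f", shape=[2, 3])

# A NumPy view over the same memory: no copy is made.
np_view = np.frombuffer(raw, dtype=np.float32).reshape(2, 3)

# Writing through the NumPy view changes the underlying buffer,
# which shows that the array is a view and not a copy.
np_view[0, 0] = 42.0

# Optional: torch.from_numpy shares memory with the NumPy array,
# so the tensor is also backed by the same buffer.
try:
    import torch
    tensor = torch.from_numpy(np_view)
except ImportError:
    tensor = None  # Torch is not installed; the NumPy part still applies.
```

Because `torch.from_numpy` accepts a NumPy array directly and shares its memory, returning a NumPy view lets callers build a tensor without an intermediate copy.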

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
