alex-shchetkov closed this 3 years ago
What is the context for this pull request?
I ran into an issue where none of the created indexes could be used, because a JSON parser reported that it had encountered invalid characters.
The error was misleading: the actual problem was that only part of the index file was being read.
What changes were proposed in this pull request?
Replace the FileSystem.read() call with FileSystem.readFully(). A single .read() call may return fewer bytes than requested, so it does not always read in the full file; .readFully() keeps reading until the buffer is full.
This fix very likely resolves the following: https://github.com/microsoft/hyperspace/discussions/431 https://github.com/microsoft/hyperspace/issues/373 https://github.com/microsoft/hyperspace/issues/297#issuecomment-747502799 (point #2)
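For reviewers who want to see the difference between the two calls in isolation, here is a minimal sketch using only the JDK. It is not the Hyperspace code: `ShortReadDemo`, `ChunkedStream`, `readOnce`, and `readAll` are hypothetical names, and `ChunkedStream` is an assumed stand-in for a remote store that returns data in small pieces. The relevant fact is that Hadoop's `FSDataInputStream` extends `java.io.DataInputStream`, so it shares the same `read(byte[])` and `readFully(byte[])` contracts shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ShortReadDemo {
    /** Hypothetical stream that hands back at most `chunk` bytes per bulk read. */
    static final class ChunkedStream extends InputStream {
        private final InputStream in;
        private final int chunk;

        ChunkedStream(byte[] data, int chunk) {
            this.in = new ByteArrayInputStream(data);
            this.chunk = chunk;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Deliberately return fewer bytes than requested, which the
            // InputStream contract allows.
            return in.read(b, off, Math.min(len, chunk));
        }
    }

    /** One read() call: returns how many bytes were actually copied. */
    static int readOnce(byte[] data, int chunk) {
        byte[] buf = new byte[data.length];
        try (InputStream s = new ChunkedStream(data, chunk)) {
            return s.read(buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** readFully(): keeps reading until the buffer is full (or throws EOFException). */
    static byte[] readAll(byte[] data, int chunk) {
        byte[] buf = new byte[data.length];
        try (DataInputStream s = new DataInputStream(new ChunkedStream(data, chunk))) {
            s.readFully(buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] json = "{\"name\":\"idx\",\"state\":\"ACTIVE\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println("read():      " + readOnce(json, 8) + " of " + json.length + " bytes");
        System.out.println("readFully(): " + readAll(json, 8).length + " of " + json.length + " bytes");
    }
}
```

If the buffer is sized to the full file length, as in this sketch, the part that a short read() leaves unfilled stays as zero bytes. That is one plausible way a JSON parser ends up reporting invalid characters instead of a truncated file.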
Does this PR introduce any user-facing change?
No
How was this patch tested?
I compiled and packaged the code, then ran it on an EMR (Spark 3.1) cluster to generate a relatively large index file (8 MB in my case) in an S3 location. With this change, I was able to use the index to run a query.