Closed driskell closed 1 year ago
@driskell thanks for submitting your bug report. From what I understand the stream wrapper you are using seems to incorrectly report its stream capabilities. If I am correct how do you expect the CSV package to behave 🤔 . If a strean wrapper does not follow the specification I don't see how the package could prevent the issue you are experiencing.
Also I see you are referring to AWS stream capabilities. Did you look into their documentations to see if you can not change it's behaviour via some specific settings 🤔
If I recall some of AWS stream are not seekable by default, you need to activate the bahviour according to their docs.
@nyamsprod It does seem this is less a case of “stream reporting seekability incorrectly” but actually a case of “PHP does not support seekability flag for a custom stream”. There is actually no way for a custom stream wrapper to report that it is unseekable.
I think the key thing here though is still the fact that there are several calls to file system functions without any check on the error response. If these indeed point to remote filesystems then you’ll have an issue. For example even if a stream is seekable and you seek forward and the network connection drops it will fail. Likewise if you have a seek backwards and the stream is using ephemeral storage to cache and the data is lost it will fail too (of course these are hypothetical - but that’s why error checking is necessary)
Would be great to include error checking as it will save developer time on investigation
It would be great if the library could however support unseekable streams. It seems there is no reason to seek for any purpose. Even BOM can just be stripped after read, rather than stripping before a read. Any rewind seeks can also be checked - if you’re at position 0 already no need to seek during rewind. Would greatly improve things as the library could then manage seekable streams effortlessly
@driskell I added the following fix:
https://github.com/thephpleague/csv/commit/78e70d9a51e016e1a07387ea2389483b396e565e
Stream
object it will throw a RuntimeException
if the rewind
action failsfseek
fails (returns -1
) a new RuntimeException
will be thrown too. I beleive that's the only thing I can do to help with the issue.
The fix will be release in the next patch version or minor version.
@driskell I added the following fix:
- to the internal
Stream
object it will throw aRuntimeException
if therewind
action fails- if
fseek
fails (returns-1
) a newRuntimeException
will be thrown too.I beleive that's the only thing I can do to help with the issue.
The fix will be release in the next patch version or minor version.
Thanks that would be perfect! It will definitely help to raise problems with incompatible streams sooner - thank you and really appreciate your time looking after this library 👍
Bug Report
Summary
It seems some custom wrapper streams in PHP report as seekable even though they are not (fseek returns false) - I am beginning to wonder that PHP does not support a custom wrapper reporting itself as not seekable. So the metadata flag for seekable is not enough.
However, rewind and fseek do have return values. But code is ignoring them. So the failure to rewind is completely ignored. This causes corrupt loads. This can happen even with a stream that reports seekable if that stream encounters an error
Standalone code, or other way to reproduce the problem
I need some time to produce an example. But it seems AWS S3 stream wrapper is an example here. Corrupt data when no "seekable" set.
Expected result
Error at least
Actual result
No error
Checks before submitting