pulp / pulpcore

Pulp 3 pulpcore package https://pypi.org/project/pulpcore/
GNU General Public License v2.0

pulpcore-content is sending corrupted content when REDIRECT_TO_OBJECT_STORAGE=false #3336

Open dkliban opened 2 years ago

dkliban commented 2 years ago

The test_download_policy tests fail intermittently for the 'streamed' tests in CI. These tests ensure that Pulp works with a storage backend such as an SFTP server. It's possible we just need to add asyncio.shield() around here [0].

[0] https://github.com/pulp/pulpcore/blob/main/pulpcore/responses.py#L154
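To illustrate what asyncio.shield() would change, here is a minimal, self-contained sketch. It is not the pulpcore code at [0]: `write_chunk`, `handler`, and `sink` are hypothetical stand-ins, and whether shielding actually fixes the corruption is still an open question. It only shows that a shielded awaitable keeps running after the task awaiting it is cancelled.

```python
import asyncio


async def write_chunk(sink: list, chunk: bytes) -> None:
    # Hypothetical stand-in for a slow write to the client or storage backend.
    await asyncio.sleep(0.05)
    sink.append(chunk)


async def handler(sink: list) -> None:
    # shield() lets the inner write finish even if this handler is cancelled;
    # the handler itself still sees CancelledError.
    await asyncio.shield(write_chunk(sink, b"data"))


async def main() -> list:
    sink: list = []
    task = asyncio.create_task(handler(sink))
    await asyncio.sleep(0.01)  # let the handler start the write
    task.cancel()              # simulate a cancellation mid-write
    try:
        await task
    except asyncio.CancelledError:
        pass
    await asyncio.sleep(0.1)   # give the shielded write time to complete
    return sink


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Without the shield() call, the cancellation would propagate into `write_chunk` and the chunk would never be appended.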

MichalPysik commented 1 year ago

I collected some statistics by running the tests with my oci-env SFTP storage profile. The runs are numbered, and each run has 3 test cases (unless an earlier case had to be interrupted by a signal because it was stuck). If only some cases are listed for a run, the others passed. Note that the nightly CI usually shows just the file corruption error (file1.body != file2.body), while here gateway timeout errors (504) are very common. They may be a direct cause of the corruption error, or they may be a separate bug specific to the oci-env profile itself.

  1. ALL PASSED

  2. on_demand STUCK - ctrl + c

  3. on_demand failed - aiohttp except Response payload is not completed - SFTPError("Garbage packet received")

  4. immediate failed - error 502 at line 138 download_file - SFTPError("Garbage packet received"); on_demand failed - error 504 - SFTPError("Garbage packet received"); streamed failed - assert 504 == 404 (http) - no traceback

  5. on_demand STUCK - ctrl + c

  6. on_demand STUCK - ctrl + c

  7. ALL PASSED

  8. ALL PASSED

  9. ALL PASSED

  10. on_demand STUCK - ctrl + c

  11. ALL PASSED

  12. immediate failed - error 504 timeout - no traceback; on_demand failed - assert body1 == body2 failed - no traceback

  13. ALL PASSED

  14. immediate - 504 timeout - no traceback; on_demand - assert body1 == body2 failed - no traceback

  15. on_demand STUCK - ctrl + c

  16. immediate - failed line 153 response payload not completed - no traceback; on_demand - failed line 138 response payload not completed - no traceback; streamed - line 108 assert failed 504 == 404 is false - no traceback

  17. on_demand STUCK - ctrl + c

  18. ALL PASSED

  19. ALL PASSED

  20. ALL PASSED

  21. on_demand failed - body1 == body2 false - ? failed to get journal cursor ?

  22. on_demand failed - line 173 response payload not completed - no traceback; streamed failed - assert 404 == 504 failed - no traceback

  23. immediate - line 128 error 504 timeout - no traceback; on_demand - assert 504 == 404 failed - no traceback; streamed - assert 504 == 404 failed - no traceback

  24. on_demand failed - line 153 error 504 timeout - no traceback

  25. immediate - line 128 error 504 timeout - no traceback; on_demand - assert 504 == 404 failed - no traceback; streamed - line 138 error 504 timeout - no traceback

  26. immediate - line 128 error 504 timeout - no traceback; on_demand - line 128 error 504 timeout - no traceback; streamed - assert 504 == 404 failed - no traceback

  27. immediate - assert 504 == 404 failed - no traceback; on_demand - assert 504 == 404 failed - no traceback; streamed - assert 504 == 404 failed - no traceback

  28. immediate - assert 504 == 404 failed - no traceback; on_demand - assert 504 == 404 failed - no traceback
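The 28 runs above can be tallied with a short script. This is only a summary sketch: the run numbers are transcribed by hand from the list, and the three buckets (passed, stuck, failed) are my own grouping, where "failed" means at least one case reported an error without the run hanging.

```python
from collections import Counter

# Run numbers transcribed from the list above.
PASSED = {1, 7, 8, 9, 11, 13, 18, 19, 20}
STUCK = {2, 5, 6, 10, 15, 17}
TOTAL_RUNS = 28


def classify(run: int) -> str:
    """Return the outcome bucket for a given run number."""
    if run in PASSED:
        return "passed"
    if run in STUCK:
        return "stuck"
    return "failed"


counts = Counter(classify(run) for run in range(1, TOTAL_RUNS + 1))

if __name__ == "__main__":
    for outcome in ("passed", "stuck", "failed"):
        print(f"{outcome}: {counts[outcome]}/{TOTAL_RUNS}")
```

So roughly a third of the runs passed cleanly, and about a fifth hung in the on_demand case.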

ipanova commented 1 year ago

So far we've been testing with SFTP storage and seeing failures there, and it is not clear whether the issue is on the Pulp side or in SFTP itself. ArtifactResponse is also used to stream the data when REDIRECT_TO_OBJECT_STORAGE is set to False with object storage. Can this be tested (only for the sake of narrowing down the issue) with some other storage, like S3 or Azure, to see whether the issue persists there too? For example, run the tests in a setup with pulp_file + pulpcore, REDIRECT_TO_OBJECT_STORAGE=False, and S3 storage.
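For reference, a settings fragment for such a setup might look roughly like the sketch below. The setting names follow the django-storages S3 backend conventions as I understand them, and every value is a placeholder, so please check them against the Pulp storage documentation for the installed version.

```python
# Hypothetical Pulp settings fragment for S3-backed storage.
# All values are placeholders and must be replaced for a real deployment.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
MEDIA_ROOT = ""

AWS_ACCESS_KEY_ID = "<access-key>"
AWS_SECRET_ACCESS_KEY = "<secret-key>"
AWS_STORAGE_BUCKET_NAME = "<bucket-name>"
AWS_S3_ENDPOINT_URL = "<endpoint-url>"  # e.g. a local S3-compatible server

# Stream artifacts through pulpcore-content instead of redirecting
# clients to the object storage.
REDIRECT_TO_OBJECT_STORAGE = False
```

With REDIRECT_TO_OBJECT_STORAGE set to False, the content app should serve the data itself, which is the code path in question here.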