Closed jhpoelen closed 1 year ago
@jhpoelen - I was able to toggle this behavior client side by removing or including the Accept-Encoding header from the request.
Let me know if you can replicate on your end.
For some reason, curl with and without Accept-Encoding header did not make a difference:
$ curl --silent -H "Accept-Encoding: gzip" "https://biokic6.rc.asu.edu/preston/gbpln/4f/06/4f06230f7d9d902ea67708a0e4eb1e5c8120a1f7b30d77260832ad5803a56e17" | gunzip | head -n2
GBPLN370.SEQ Genetic Sequence Data Bank
June 15 2023
and
$ curl --silent "https://biokic6.rc.asu.edu/preston/gbpln/4f/06/4f06230f7d9d902ea67708a0e4eb1e5c8120a1f7b30d77260832ad5803a56e17" | gunzip | head -n2
GBPLN370.SEQ Genetic Sequence Data Bank
June 15 2023
however, I did notice https://github.com/spring-cloud/spring-cloud-netflix/pull/1591/commits/537792c130e8b9b29085de140ed61afdff2934bb , and found that httpclient on my end was the likely culprit, because it automatically decompresses gzip content by default.
before disabling automated decompression -
preston cat \
  --no-cache \
  --remote https://biokic6.rc.asu.edu/preston/gbpln \
  hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 \
  | head -n2
yielded
GBPLN987.SEQ Genetic Sequence Data Bank
June 15 2023
after disabling default content decompression:
preston cat \
  --no-cache \
  --remote https://biokic6.rc.asu.edu/preston/gbpln \
  hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 \
  | gunzip \
  | head -n2
yielded the following after including gunzip in the pipe:
Genetic Sequence Data Bank
June 15 2023
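To tell these two cases apart without eyeballing the output, a quick check of the first two bytes may help. This is only a sketch: `is_gzip` is a hypothetical helper name, and it assumes `head -c` and `od` are available on the machine.

```shell
# Succeeds (exit status 0) when the data on stdin starts with the
# gzip magic bytes 1f 8b, and fails otherwise.
# `is_gzip` is a hypothetical helper, not part of preston or curl.
is_gzip() {
  magic=$(head -c 2 | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "1f8b" ]
}

# Possible usage (needs network access, so left commented out):
#   preston cat --no-cache --remote https://biokic6.rc.asu.edu/preston/gbpln \
#     hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 \
#     | is_gzip && echo "raw gzip" || echo "already decompressed (or not gzip)"
```

If the client decompresses behind the scenes, the check should fail even though the stored content is gzipped.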
I am curious what you see if you run:
$ curl --silent -H "Accept-Encoding: deflate" "https://biokic6.rc.asu.edu/preston/gbpln/4f/06/4f06230f7d9d902ea67708a0e4eb1e5c8120a1f7b30d77260832ad5803a56e17" | head -n2
$ curl --silent -H "Accept-Encoding: deflate" "https://biokic6.rc.asu.edu/preston/gbpln/4f/06/4f06230f7d9d902ea67708a0e4eb1e5c8120a1f7b30d77260832ad5803a56e17" | head -n2
[binary output omitted: raw gzip-compressed bytes, unreadable as text]
So that returned the raw compressed file - I'm guessing that curl has "Accept-Encoding: gzip,..." as a default unless it is explicitly overridden.
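One way to test that guess is to look at the headers curl actually sends and receives. The sketch below filters `curl --verbose` output, where request header lines start with `>` and response header lines start with `<`. As far as I know, curl only adds an Accept-Encoding header on its own when `--compressed` is given, but the verbose output should settle it either way. `encoding_headers` is a hypothetical helper name.

```shell
# Keep only the request (">") and response ("<") header lines that
# relate to content negotiation, reading `curl --verbose` output on stdin.
# `encoding_headers` is a hypothetical helper, not a curl feature.
encoding_headers() {
  grep -i -E '^[<>] (accept-encoding|content-encoding|content-type):'
}

# Possible usage (needs network access, so left commented out):
#   curl --silent --verbose --output /dev/null \
#     "https://biokic6.rc.asu.edu/preston/gbpln/4f/06/4f06230f7d9d902ea67708a0e4eb1e5c8120a1f7b30d77260832ad5803a56e17" \
#     2>&1 | encoding_headers
```

Running it once with and once without `-H "Accept-Encoding: gzip"` would show whether the request headers differ and whether the server answers with a Content-Encoding header.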
After deployment of preston v0.6.5, the following expected behavior is observed:
preston cat --no-progress --no-cache --remote https://biokic6.rc.asu.edu/preston/gbpln hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 | gunzip | head -n2
GBPLN987.SEQ Genetic Sequence Data Bank
June 15 2023
and
preston cat --no-progress --no-cache --remote https://linker.bio hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 | gunzip | head -n2
GBPLN987.SEQ Genetic Sequence Data Bank
June 15 2023
where linker.bio is proxying the BioKIC server using Preston v0.6.5.
@GregPost-ASU thanks for being patient in helping to troubleshoot this funky issue.
I still don't quite understand why I wasn't able to reproduce the curl commands. Am closing issue for now, as the desired behavior is observed after changes were applied.
Please do feel free to comment or re-open if you feel more work is needed.
steps to reproduce:
expected: the content is served as is, no decompression / compression
actual: for some reason, apache server config automagically detects gzip files and decompresses them
fyi @GregPost-ASU
example:
yields plain text -
and we know that hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 is a gzipped file.
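A stricter check of the "served as is" expectation could compare the SHA-256 of the received bytes with the hash in the identifier. This sketch assumes that the hex part of a hash://sha256/... identifier is the SHA-256 of the raw, still-gzipped bytes, and that GNU `sha256sum` is installed (on macOS, `shasum -a 256` is a common substitute). `matches_sha256` is a hypothetical helper name.

```shell
# Succeeds when the SHA-256 of stdin equals the expected hex digest
# passed as the first argument.
# `matches_sha256` is a hypothetical helper, not part of preston.
matches_sha256() {
  want=$1
  got=$(sha256sum | cut -d ' ' -f 1)
  [ "$got" = "$want" ]
}

# Possible usage (needs network access, so left commented out):
#   preston cat --no-cache --remote https://biokic6.rc.asu.edu/preston/gbpln \
#     hash://sha256/90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 \
#     | matches_sha256 90346c5616571af8fbacdd8449b5f04197c227ff4e8250443f1e76f649c21ec8 \
#     && echo "served as is" || echo "content was altered in transit"
```

If something along the way decompresses the content, the digest should no longer match, which makes this a useful complement to piping through gunzip.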