Closed MontrealSergiy closed 5 years ago
Thanks for the issue.
I'm not sure there's anything wrong with the current explanation. A zip file is not the same as a plain text file. Certainly the key points are that these docx files don't mean anything to commands like head, and as such you shouldn't use them to write shell scripts. Try this:
Create a test .docx file, then run head on it.
Downloads $ head test.docx
PK���N
_rels/.rels��MKA
���C��l+����"Bo"�������3i���A
�P��Ǽy���m���N���AêiAq0Ѻ0jx�=/`�/�W>��J�\*�ބ�aI���L��41q��!fOR�<b"���qݶ��2��1��j�[���H�76z�$�&f^�\��8.Nyd�`�y�q�j4�
x]h�{�8
��S4G�A�y�Y8X���(�[Fw�i4o|˼�l�^��͢����P��#�=PK���NdocProps/app.xml���n� ��}
�C@1KK�"��Q��˘'�&g�B/1��P��ONz�Ho+(��=��\��#�����/�Wg�x�W��NN��ʡ�*Ӄ`�x����(�i#bc>��T�"g�]>�l�=���eQ�l���@!e�g�����t��Ƿ��(f�^�sAG���:��V�0�����Y����IL�h|��eTi��oݠ�V-�P��P"PK���NdocProps/core.xml�R]O�0}�W,}ߺ
O�R�-�9#i�2��V%�AL�ґ�Jsz�W��^������P��I+�e����¿E��D2RV2t�f�UJUB+
����0�̙Rp{PpQړ�zo� ��:�G�������s;��e�*%�X�C_
;�G����b������dH5�,��Kw�56?8��$'����z�vq���=30Tse�
;���<n����l?>��n���r[B���?�1�P��:�c�PK���Nword/_rels/document.xml.rels��M
�0���"�ަU��nDp+�1���6 �(z{�Z(����}�1/__��]�m��,I��Q�Ҧp(��%��I��NR\ �v���������������?@��������I��wP�/0��PK���Nword/settings.xmlEO�N�@��>.?��|m
����tC���T\z�F�7qӕv���4��c��C3��=��BU��w��I�������#4j�3&a��v�M�vJf���X�u��Y�B��Lu#�ص�Ԍ��.a�:�*�z4��mۇ�12�^�)���+T'b�9m
�[���o�x|1)n�`�~�'O�
�؇�v�+h/l���5՞4�/;�9:
?����
P�|�>�PK���Nword/fontTable.xml�PAN�0��
�w�����U%� �@��d�#�I��q�DB�B�f�����r����0��T����I���S%�����Rp:����<#��:��o�V�� �o8�އc�F����ZŃr`H2�72����'��R����ܘ��jH'�����{���{��P&���Ӂ�dQH���{�!�3К��q�A0p�x���������Nz-n��I�i�ɳ�7��z1
�l��`��
6n:�|�[M%�ߺ��ɀx*ص��3<x� P��J�UPK���Nword/document.xml�T�n�0��+�mIM�B�\�=40`�h���r�pd����-
`����[F�_�NN�r�d�:c���U�6%�u���gI@n+���%;��7_��r�3�bB
6�d�"�VVF p�ո�����A�peg�<�"p��;�5�����[#�\���ƚl�ӻ�_���۷�˅�����OlF\��p���81OO�����%�b*�����3�?�x~�(~4�?jD�DwlCw���s\��k�e�g-��8q]���Z�4"
�e���9�*$�� ʀ¡`h�EA
��~V��w����o�ӟ���|����w�7�S�O���5Ƣ��X�i_m[�+I�6�t~Aj�pF�љl:�Q��3��jm���B�Ӊ�c�5�a
�����iT�����p
s��5�4FZY�S(b���Q���
9�J�9��M����/Pj��X�*PK���Nword/styles.xmlŕaO�0���WD�^b�"b놺i�pu.��c{�C(�~��tmӎ�1�K�׳�����]<U<zDm�)9<HH��ʜ�iJ���
NId,����rq���;�h"�/̰IIi�Ʊ�%V`�B�b��X��Ӹ�:WZR4�m_��(IN�
� �6�ǽ�*F�4��TV�,
What you get is mostly not ASCII characters, and certainly isn't the first ten lines of what you wrote.
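For anyone without Word at hand, the same effect can be reproduced with a rough Python sketch. It assumes a hypothetical stand-in archive (a zip holding a single word/document.xml member with made-up XML), not a real .docx, and the names `DOCUMENT_XML` and `build_fake_docx` are invented for this illustration.

```python
import io
import zipfile

# Hypothetical stand-in for the XML a word processor might write.
# This is an assumption for illustration, not a complete or valid .docx.
DOCUMENT_XML = (
    '<w:p><w:r><w:rPr><w:sz w:val="24"/></w:rPr>'
    "<w:t>Hello</w:t></w:r></w:p>\n"
) * 50


def build_fake_docx() -> bytes:
    """Return the raw bytes of a zip archive laid out like a .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


raw = build_fake_docx()

# The archive starts with the zip local file header signature ("PK...").
print(raw[:4])

# Member names are stored uncompressed, which is why names such as
# word/document.xml are readable in the head output.
print(b"word/document.xml" in raw)

# The member's content is compressed, so the text itself is not
# visible in the raw bytes that head would read.
print(b"<w:t>Hello</w:t>" in raw)
```

This matches the output shown earlier: the PK markers and the file names are legible, while everything between them is compressed data.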
Thanks again for raising the issue, and do let me know if I've missed something.
I agree with @gcapes here. For the purposes of this lesson, the description is accurate: the tools we are covering can't handle the file as text.
Thanks for your suggestions!
I disagree with the statement that formatting details such as font sizes are NOT encoded as characters. They are in fact encoded as characters, even though the whole docx (Office Open XML) document is additionally compressed into zip format. Some advanced text editors such as vim can edit zipped text documents. head, cat, tail, sed and similar tools can be used too, with the aid of zip/unzip filters or pipes.
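A minimal Python sketch of that "unzip, then head" idea follows. It assumes a simplified, made-up word/document.xml (the `SAMPLE_XML` content, `make_sample_archive` and `head_of_member` are hypothetical names for this illustration), so it shows the principle rather than the layout of a real Word file.

```python
import io
import zipfile

# Simplified, made-up XML; a real document.xml is longer and declares namespaces.
SAMPLE_XML = "\n".join(
    [
        '<?xml version="1.0" encoding="UTF-8"?>',
        "<w:document>",
        '  <w:rPr><w:sz w:val="24"/></w:rPr>',
        "  <w:t>Hello</w:t>",
        "</w:document>",
    ]
)


def make_sample_archive() -> bytes:
    """Build an in-memory zip laid out like a .docx (an assumption, not a real file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", SAMPLE_XML)
    return buffer.getvalue()


def head_of_member(archive_bytes: bytes, member: str, count: int = 10):
    """Decompress one member and return its first `count` lines, like head."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        text = archive.read(member).decode("utf-8")
    return text.splitlines()[:count]


# Once the member is decompressed, the formatting markup is ordinary text.
for line in head_of_member(make_sample_archive(), "word/document.xml", count=3):
    print(line)
```

The point of the sketch is only that the decompression step has to happen first; applied directly to the archive, the same line-based reading would see compressed bytes.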
The lesson says: "... .docx files to store not only text, but also formatting information about fonts, headings, and so on. This extra information isn’t stored as characters, and doesn’t mean anything to tools like head"
This is not completely accurate. Docx files do contain that formatting information as characters, but the files are zipped. Docx files are XML, zipped to take less space. I think it would be better to use executable binaries, photos, video or audio as the example.