Open JonoYang opened 4 years ago
We use os.path.getsize to get the size of a file. I ran os.path.getsize('configure.bat') on Windows and Ubuntu to see whether the sizes differ. On Windows, configure.bat is 4185 bytes; on Ubuntu, it is 4064 bytes. The size discrepancy is due to the line-ending differences between Windows and Linux. Lines in text files on Windows end in \r\n rather than just \n as on Linux, so for every line in a text file on Windows there is an extra byte.
On a freshly cloned repo, typecode/tests/typecode/data/contenttype/size/dir/a.txt is 2 bytes on Ubuntu and 3 bytes on Windows.
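To make the byte arithmetic concrete, here is a minimal sketch that writes the same text once with LF and once with CRLF endings and compares what os.path.getsize reports. The helper name size_on_disk and the two-line sample are hypothetical, made up for illustration; they are not part of typecode.

```python
import os
import tempfile


def size_on_disk(text, newline):
    """Write text with the given line ending and return the file size in bytes."""
    data = text.replace("\n", newline).encode("ascii")
    fd, path = tempfile.mkstemp()
    try:
        # Binary mode, so Python performs no newline translation of its own.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return os.path.getsize(path)
    finally:
        os.remove(path)


sample = "@echo off\necho hello\n"  # hypothetical two-line batch file
lf_size = size_on_disk(sample, "\n")
crlf_size = size_on_disk(sample, "\r\n")
# The CRLF copy is larger by exactly one byte per line.
print(lf_size, crlf_size, crlf_size - lf_size)
```

By the same reasoning, the 4185 - 4064 = 121 byte gap on configure.bat would correspond to 121 converted line endings, assuming every line in the file was changed from LF to CRLF.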
@JonoYang this difference would be quite a surprise!
I did a quick check and the size is the same for me, given the exact same file (sha1-wise) as input. IMHO you must have been tripped up by using different checkouts or branches on each OS :)
@pombredanne so configure.bat is 4064 bytes on both Windows and Ubuntu for you?
On a freshly cloned typecode repo:
C:\Users\Jono\Desktop\typecode-new-new>python
Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.getsize('configure.bat')
4185L
>>> len(open('configure.bat').read())
4064
>>>
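The two numbers in the session above differ most likely because open() without a 'b' flag reads in text mode, which on Windows translates each \r\n to \n, while os.path.getsize counts the raw bytes on disk. A rough sketch of a check that reads the file in binary mode and counts the line endings directly follows; the function name line_ending_report and the sample bytes are assumptions for illustration only.

```python
import os
import tempfile


def line_ending_report(path):
    """Count CRLF and bare-LF line endings from the raw bytes of a file."""
    with open(path, "rb") as f:  # binary mode: no newline translation
        data = f.read()
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf
    return {
        "bytes": len(data),                # what os.path.getsize reports
        "crlf": crlf,
        "lf_only": lf_only,
        "bytes_without_cr": len(data) - crlf,  # size if every CRLF were LF
    }


# Hypothetical sample: two CRLF lines and one LF line.
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"a\r\nb\r\nc\n")
report = line_ending_report(tmp_path)
os.remove(tmp_path)
print(report)
```

Running something like this on configure.bat from each checkout would show whether the extra bytes are all carriage returns.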
I think the repo is being cloned with CRLF line endings added automatically to text files on Windows. The same thing is happening on the Azure CI with respect to TestContentTypeComplex.test_size
@pombredanne I got around the size detection difference by modifying the Windows Azure pipeline job template to run git config --global core.autocrlf false before checking out the repo, so Git doesn't replace the line endings when checking out files. I also had to add a .gitattributes file that sets configure.bat eol=crlf, because I found that Windows does not run a batch script properly when it has LF line endings rather than CRLF.
There are still filetype/mimetype detection differences remaining.
Some of the tests have different results on Windows:
In the case of test_filetest_code_java_logger_56, a different filetype, mimetype, and file size were detected.
Expected result for test_filetest_code_java_logger_56:
Result for test_filetest_code_java_logger_56:
The detected types and size should be the same. The other failing tests have similar issues.