Open bsandeepan opened 5 years ago
Without having looked at your fork -- you can write code that works with Py2+3 simultaneously with `six` and a few other tricks. If you do this, we can integrate it via a pull request.
Did Py2+3 compatible coding before with "tricks". Okay, I will check out "six" module.
Hi @palday Few queries -
`six` module but the lines of code requiring backward compatibility support are few. Also, I am proposing to support only Python 2.7, not older deprecated versions (2.6 or lower). Py2.7 will be officially discontinued from 2020. Therefore, it seems only logical to use "tricks" to write a few lines of backward compatible code rather than using a large dedicated module, given that this library may only read data from and write data to EDF/EDF+ files. That, however, is my personal opinion. I might be missing some aspects here. If so, please let me know. I would appreciate your help with these issues. Thank you. 😃
Regarding (1): I have no strong preference for how to make it compatible with both simultaneously. Py2.7 + Py3 with just a few `try` blocks or the like is perfectly fine with me.
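For illustration, a minimal sketch of the "few `try` blocks" approach. The names `text_type` and `is_text` are hypothetical and not taken from EDF.py; the point is only that one guarded lookup can cover both interpreters.

```python
# Hypothetical compatibility shim; these names are illustrative only
# and do not come from EDF.py.
try:
    text_type = unicode  # Python 2: text strings are `unicode`
except NameError:
    text_type = str      # Python 3: `unicode` no longer exists, `str` is text


def is_text(value):
    """Return True if value is a text string (not bytes) on either version."""
    return isinstance(value, text_type)
```

On Python 3, `is_text(b"abc")` is false because `bytes` and `str` are distinct types there.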
Regarding (2): I don't know off the top of my head and probably won't have time to take a closer look this week.
Why would you make an effort to support Python 2.7, instead of focusing all efforts on a good Python 3 package?
Because it's already Python 2.7 code, so it's trivial in terms of effort, and for better or worse, Python 2.7 is still around in a lot of places.
Hi @palday I made the changes to have Py2.7+3 support in EDF.py (literally one try block). Should be working fine. Kindly check and confirm. Thanks. https://github.com/bsandeepan95/pyedf
It has been 12 days and I made more changes -
The only incompatibility with EDF is, that signals may be recorded discontinuously. Therefore, we have decided that the EDF+ 'version' field must still read '0' like in EDF. In this way, old EDF viewers will still work and display EDF+ files (be they continuous or discontinuous) as continuous EDF files. EDF+ software will know the difference between continuous and discontinuous files from the mentioned 'reserved' field.
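As a rough sketch of what that means for a reader, assuming the standard 256-byte fixed header layout (8-byte version field at offset 0, 44-byte reserved field at offset 192, starting with `EDF+C` or `EDF+D` for EDF+ files). The function name `edf_flavor` is hypothetical and is not part of pyedf.

```python
import io


def edf_flavor(stream):
    """Classify a file as 'EDF', 'EDF+C' or 'EDF+D' from its fixed header.

    Hypothetical helper, not part of pyedf. `stream` must be opened in
    binary mode and positioned at the start of the file.
    """
    header = stream.read(256)
    if len(header) < 256:
        raise ValueError("header is shorter than 256 bytes")

    version = header[0:8]        # '0' padded with spaces in both EDF and EDF+
    reserved = header[192:236]   # 44-byte 'reserved' field

    if version.strip() != b"0":
        raise ValueError("unexpected version field: %r" % version)
    if reserved.startswith(b"EDF+C"):
        return "EDF+C"           # EDF+, continuous recording
    if reserved.startswith(b"EDF+D"):
        return "EDF+D"           # EDF+, possibly discontinuous recording
    return "EDF"                 # plain EDF leaves the field unmarked


# Usage with a synthetic header (real files carry actual field values).
fake_header = b"0".ljust(8) + b" " * 184 + b"EDF+D".ljust(44) + b" " * 20
flavor = edf_flavor(io.BytesIO(fake_header))
```

This is also why the first field cannot serve as a version check on its own: it reads `0` for all three cases.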
I tested the code in py2.7 and it seems to be running with no issues. Let me know your results. Then we can go ahead with pull requests.
Thanks for putting in so much effort.
This is multiple pull requests. If you separate out the addition of the doc strings (for existing functions, not for new ones) into a separate feature branch, that would be a relatively straightforward pull request you could make now.
I'm not sure how I feel about re-organizing the README.
I'm not sure changing the `magic` field to `file_ver` is necessary. "Magic numbers" are a concept in computer science for indicating file type (e.g. `0xDEADBEE`) and it's not really a file version -- after all EDF and EDF+, which are different, not entirely compatible formats, both use a zero in that field.
I don't know how I feel about adding the test file to the repository -- even though it's small for EEG, it's big for code. An alternative is to configure the test suite so that the file is downloaded as necessary. @robertoostenveld Thoughts?
For some of the PEP8 stuff with linebreaks: you were a bit zealous with parentheses. Binary operators (such as `or`) imply parentheses and having an open parenthesis right before a line break is not nice. If you need to add parentheses for something of that sort, then put it around the entire conditional.
For `write_byte`: I'm not sure assuming UTF-8 is the right decision there. I think the far better thing to do would be to treat everything as bytes and not as strings and use the associated bytes methods in Python 3, instead of posthoc converting things. Your solution will probably work most of the time if you read in bytes, represent them as UTF-8 strings, and then write them back out, but it's misleading about what's actually going on and will cause some problems when you have bytes sequences that correspond to non-existent Unicode codepoints or `0x00`.
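A small sketch of the failure mode, using made-up sample bytes (not taken from any real EDF file): binary sample data can easily contain byte sequences that are not valid UTF-8, whereas staying in bytes end to end is lossless.

```python
import io

# Made-up bytes standing in for binary sample data; 0xFF can never
# start a valid UTF-8 sequence.
raw = b"\xff\xfe\x00\x10"

try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False  # decoding arbitrary binary data as UTF-8 can fail

# Treating the data as bytes throughout involves no interpretation at all.
buffer = io.BytesIO()
buffer.write(raw)
round_tripped = buffer.getvalue()
```

Here `decoded_ok` ends up false, while `round_tripped` is identical to `raw`.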
I think I can make a pull request for the docstrings but I think a better one is when it's done as expected.
I did not know the concept of magic numbers but now that I have read about it, I understand the purpose and will revert this change asap. Also, I will leave a note about it in the docstring so that no one else gets confused over this in the future like I did.
I included the test file for ease, but I think we can add the download link in ReadMe and ask the devs to download it or any other test file, and change the file name in code for testing (or create a new .py file for this as I did once).
I am not particularly obsessed with parentheses; I was trying to keep to the 80-char line limit and present the code in a cleaner, more readable way. An open parenthesis is somewhat similar to the open curly brace that precedes the body of a loop or conditional statement in other languages. I will see how much I can do about styling the binary operators.
About `write_byte`, I also think that ASCII conversion is the better path since the EDF file format reeks of it, along with the Y2K bug it has due to being a format of the pre-2000 era. I used UTF-8 absentmindedly and since it did not raise any issues (I tested the newly created file with EDFBrowser and it seems identical), I was not bothered about it. So, do we only change it to ASCII chars, or do we also add exception handling for possible use of non-ASCII chars?
The docstrings are a separate issue from the other things and really should be a separate pull request. That will allow us to review/comment on the docstrings as a purely documentation issue and not on a Python 2/3 issue. There is no need to comment on the 'magic' part of the magic number, but generally adding references to the various fields in the EDF header would not be a bad idea.
Although I'm not entirely onboard with the PEP8 obsession of 79 chars ("consistency is the hobgoblin of little minds" and all that), I understand why you had parentheses there and instead wanted to suggest a different solution. The more common one is to put the whole condition in parentheses (i.e. everything between `if` and `:`).
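To make the suggestion concrete, a minimal sketch with a made-up condition (the function and its parameters are hypothetical, not from EDF.py): the whole condition is wrapped once, and each line break falls before a binary operator.

```python
def needs_review(n_channels, sample_rate, has_annotations):
    """Hypothetical check, used only to show the line-breaking style."""
    # One pair of parentheses around everything between `if` and `:`.
    if (n_channels > 64
            or sample_rate > 1000
            or has_annotations):
        return True
    return False
```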
I wouldn't call the dates in EDF a Y2K bug because they don't suffer from the same ambiguity as say financial accounts because there were no digitally recorded EEG signals before the 1970s and it's a fair assumption that no recording in EDF format existed more than a few years before the standard was approved in 1992. Note that EDF+ addresses this in various ways: the birthdate field in the patient info uses a full 4-digit number for year and there is an additional field for fully specifying the recording year.
By design, UTF-8 will pass undetected through systems expecting but not interpreting ASCII (such as various email systems). The problem is that Python 3's string type isn't required to use any particular coding scheme internally to represent Unicode codepoints. So now the bytes are being interpreted and converted to codepoints and then reconverted later on, and it's this application of 'semantics' that causes problems. The correct thing to do here is to read it as bytes and only decode the actual text fields (which are specified to use a very small subset of ASCII related to the usual way of representing dates) from ASCII and then likewise for writing encode ASCII strings to bytes. Any `UnicodeError`s or the like I would treat as fatal.
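A minimal sketch of that approach. The helper names `read_text_field` and `write_text_field` are hypothetical and not part of pyedf. There is deliberately no `try`/`except`, so any `UnicodeError` propagates and aborts the read or write.

```python
def read_text_field(raw):
    """Decode one fixed-width header field from ASCII bytes.

    Hypothetical helper. Non-ASCII bytes raise UnicodeDecodeError,
    which is left uncaught on purpose (treated as fatal).
    """
    return raw.decode("ascii").strip()


def write_text_field(text, width):
    """Encode a text field to ASCII bytes, space-padded to `width`.

    Hypothetical helper. Non-ASCII characters raise UnicodeEncodeError,
    which is likewise left uncaught.
    """
    raw = text.encode("ascii")
    if len(raw) > width:
        raise ValueError("field %r exceeds %d bytes" % (text, width))
    return raw.ljust(width)  # bytes.ljust pads with ASCII spaces by default


# Usage: text fields go through ASCII, sample data never leaves bytes.
start_date = read_text_field(b"17.04.01")
version_field = write_text_field(u"0", 8)
```

Sample data would be read and written as raw bytes and never passed through these helpers.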
Apologies for such delay. I was occupied in office projects. Will make a branch to add the documentation separately soon.
Hey @palday ,
I hope you are still online. Apologies for the EXTREME delay. I have removed the docstrings for the time being (they are backed up elsewhere). The PEP8 formatting is still there, but I am hoping it will help you review the code more easily. Do let me know if the latest commit is okay to merge. Afterwards, I will create a separate pull request for the docstrings.
The project link - https://github.com/bsandeepan95/pyedf
Why not move to Python 3?
As of January 1st, 2020 Python 2 is no longer supported. Most Python libraries out there have moved to Python 3 and have dropped Python 2 support.
@DimitriPapadopoulos
As pointed out in the README, there is a Python 3 version available:
Notably, that work began before Python 2 support EOL.
But as to why not make the Python 3 branch the default?
`py3k` branch here and MNE-Python). I would be happy to consider pull requests to provide better Python 3 support for this package but only once there are tests. This means that potential contributors either need to wait for me to write tests and set up CI or they need to contribute tests first. (In the latter case, I would be happy to help set up CI -- that's the easy part!)
The vast majority of Python libraries such as NumPy or SciPy have dropped support for Python 2 and moved to Python 3. They cannot be collectively wrong.
Applications still in need of Python 2 may use old releases of pyedf that run with Python 2. You could tag a last Python 2 compatible release for that purpose, perhaps one with very basic fixes I could provide, such as PEP8 compliance. Besides, why care so much about machines that run Python 2 if you're not compensated for working on this codebase?
That said, I agree tests would be great. Is Python 2 still available for CI tests?
I'll try MNE in the short term.
I believe this package is in need of new maintainers; I have little interest in Python and EDF, and I think that @palday has moved on to mainly working on/with Julia. @DimitriPapadopoulos and @bsandeepan would you be interested?
@robertoostenveld Sure. I could help with basic maintenance: prepare a cleaned up Python 2 release, move to Python 3, fix bugs and adapt to new releases of dependencies. I probably won't have time for large changes myself, but I can review contributions.
@robertoostenveld I checked the code repo and it seems to be compatible with python 2.x. Since Python 3.x is getting more welcomed, I forked the repo and made a few edits to get it working with Python 3.x. If you have any similar plans, maybe I can collaborate with you for this project? check out - https://github.com/bsandeepan95/pyedf