imgurbot12 / pyxml

Pure python3 alternative to stdlib xml.etree with HTML support
MIT License

Resolving compatibility issues with module _compat #3

Closed elisbyberi closed 1 year ago

elisbyberi commented 1 year ago

I have started working on resolving compatibility issues, and have created a new module specifically for this purpose. Please let me know if there are any issues that need to be addressed by commenting on #1.

imgurbot12 commented 1 year ago

@elisbyberi just as an FYI, I'm converting this project into a standard Python library that I'm going to publish on PyPI, since I plan on using it within my pypub, as I mentioned before.

I'll move a copy of the current master branch into its own codon branch to preserve any translation efforts in the works. Apologies for the inconvenience.

elisbyberi commented 1 year ago

@imgurbot12 No problem, I understand. Thank you for letting me know.

I will update you when the working version of Codon becomes available next week. My brother is currently working on some Python stdlib libraries for Codon. I will utilize these libraries to continue working on codon-xml. Once it's ready, you can test it in Codon.

elisbyberi commented 1 year ago

@imgurbot12 I am closing this pull request. I have initiated a new project to develop a purely Python-based implementation of the library that doesn't rely on non-pure Python libraries. This implementation will work seamlessly with all Python implementations. As the project is still in its planning phase, I cannot provide an ETA at this time.

Please note that the "pyxml" library, while potentially being a pure Python implementation of the standard "xml" Python library, relies on non-pure Python libraries and is therefore incompatible with various Python implementations. Working on this pull request diverges from the main objective, which is to make "pyxml" compatible with Codon. This is because it's not "pyxml" itself that needs to be addressed, but rather its dependencies. It is much easier to implement it from scratch using only Python built-ins.

imgurbot12 commented 1 year ago

Alright, that's fair. I'd perhaps argue with you over your definition of "pure python" considering I believe python3 typehints to be critical components of the standard library, but I understand where you're coming from. Best of luck to you.

elisbyberi commented 1 year ago

@imgurbot12 Quoting PEP 484 Non-Goals: "It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention." Therefore, no Python implementation should be obligated to support type hints.
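As a small illustration of that non-goal, here is a minimal sketch showing that standard Python stores annotations but does not enforce them at runtime. The function name `double` is made up for this example.

```python
def double(value: int) -> int:
    # The annotations are recorded on the function object,
    # but the interpreter never checks them when the call happens.
    return value * 2


# Passing a str where an int is annotated still runs: str * 2 repeats the string.
result = double("ab")
print(result)
print(double.__annotations__)
```

A static checker such as mypy would flag the call, but that is an optional tool layered on top of the language.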

However, the issue at hand is not related to type hints. Instead, it concerns non-pure Python libraries that are part of the standard Python distribution and are implemented using Python C extensions. This is similar to using assembly code in C, which is not portable.

Furthermore, the lack of Unicode support in Codon exacerbates this issue. The pure Python library must be encoding-agnostic in terms of string data for performance reasons. Supporting Unicode would make the code unnecessarily slower (I always use "array.array()" for string manipulation because it is encoding-agnostic and the code remains unaffected by the string's encoding.)
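To make the "array.array()" remark concrete, here is a minimal sketch of byte-level manipulation that never decodes the text. The helper name `replace_byte` is hypothetical, and the sketch assumes an ASCII-compatible encoding such as UTF-8 or Latin-1, where markup characters are single bytes that cannot appear inside a multi-byte sequence.

```python
from array import array


def replace_byte(data, old, new):
    """Replace every occurrence of byte value `old` with `new`, without decoding."""
    buf = array("B", data)  # unsigned bytes; initialised from a bytes-like object
    for i in range(len(buf)):
        if buf[i] == old:
            buf[i] = new
    return buf.tobytes()


raw = "héllo".encode("utf-8")
print(replace_byte(raw, ord("l"), ord("L")).decode("utf-8"))
```

The same function works unchanged on Latin-1 input, because it only ever compares raw byte values.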

imgurbot12 commented 1 year ago

@elisbyberi, I understand not requiring typehints. Not requiring them is one thing, but not supporting them is another. I would assume and hope Codon seeks to be natively compatible and hot-swappable between normal Python and Codon. Not supporting Python's standard type hints excludes their usage from any code base, which severely limits what can be implemented. I dare say the adoption of Codon will be extremely low if Codon cannot maintain parity between the two.

In terms of using C-libraries, I'm not sure what you are referring to. The two libraries that I import and install are dataclasses and typing_extensions which are simply backports of features provided in the latest version of python 3.11. They're just parts of the standard-library. The features I'm using are not cpython either. They're native python: https://github.com/python/cpython/blob/3.11/Lib/dataclasses.py https://github.com/python/typing_extensions/blob/main/src/typing_extensions.py
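One rough way to check this on a given interpreter is to ask the import system where a module comes from. The sketch below is an illustration only: the helper name `is_pure_python` is made up, and it inspects just the top-level module, not the modules it imports in turn.

```python
import importlib.util


def is_pure_python(module_name):
    """Return True if the module's top-level source is a .py file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return False  # not installed, or a namespace package
    # Built-in modules report "built-in"; C extensions report a .so/.pyd path.
    return spec.origin.endswith(".py")


for name in ("dataclasses", "math"):
    print(name, is_pure_python(name))
```

On CPython this reports `dataclasses` as a plain source file and `math` as compiled. It says nothing about whether the modules that `dataclasses` itself imports are pure Python.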

elisbyberi commented 1 year ago

@imgurbot12 Examples of Python standard libraries implemented using C extensions include builtins (bytes, bytearray), and io, among others. Virtually every Python standard library relies on a C extension. It is not feasible to maintain a custom partial implementation of these modules specifically for the "pyxml" library. We may consider using them when they become a part of the Codon standard library, but until then, it is preferable to implement "pyxml" at a low level using Codon's List or Array. In other words, we should leverage the features available in both Python implementations.
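As a rough sketch of what "low level" could look like, the tokenizer below uses only `str` and `list` operations and no imports. The function name `tokenize` and the token shapes are assumptions made for illustration; this is not pyxml's parser, it ignores comments, CDATA, and attributes, and it has not been verified against Codon.

```python
def tokenize(markup):
    """Split markup into ("tag", name) and ("text", data) tuples."""
    tokens = []
    pos = 0
    while pos < len(markup):
        start = markup.find("<", pos)
        if start == -1:
            # No more tags: the remainder is plain text.
            tokens.append(("text", markup[pos:]))
            break
        if start > pos:
            tokens.append(("text", markup[pos:start]))
        end = markup.find(">", start)
        if end == -1:
            raise ValueError("unterminated tag at offset " + str(start))
        tokens.append(("tag", markup[start + 1:end]))
        pos = end + 1
    return tokens


print(tokenize("<a>hi</a>"))
```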

"It is much easier to implement it from scratch using only Python built-ins." The reason for my statement is that due to the lack of proper error handling in Codon, it is more challenging to port an existing Python codebase to Codon as compared to building a new one from scratch. When starting from scratch, it is possible to trace back and forth to identify and resolve errors. However, when working with an existing codebase, it can be challenging to understand what is happening. Additionally, Codon currently lacks proper IDE support.

Codon is not intended to replace CPython. Furthermore, while Codon is still in development, CPython is also evolving rapidly.

imgurbot12 commented 1 year ago

I'm sorry, I'm a bit confused. I could maybe understand an argument about performance for writing this with whatever underlying methods exist within Codon, but if you don't intend to support builtin Python types and functions, then many of the project's taglines just seem blatantly false: "A high-performance, zero-overhead, extensible Python compiler using LLVM" and "Codon is a Python-compatible language". Both of these statements are simply not true if Codon cannot support normal Python types and language builtins.

I assumed the intent was to enable performance improvements using static typing, but if the static typing is required to be different, it's no longer Python, it's just Python-like, even more so if things like bytes and bytearray aren't supported. bytes especially are used all over python3. They're one of the largest additions to python3, and the way the two versions manage and handle strings is what caused the split between python2 and 3. python2 has been deprecated for 3 years at this time. No one is coding in python2 anymore outside of very rare legacy exceptions. To not include python3 standard features is insanity.

elisbyberi commented 1 year ago

@imgurbot12 The philosophy of Codon is to optimize slow Python code by compiling it down to machine code, while also offering extensive support for CPython. It is not intended to replace CPython but rather to complement it. Codon allows for seamless interaction between CPython and itself, and it is also possible to create Python C extensions using Codon.

Codon 0.y.z is still a work in progress, which means that there are things missing and that changes may occur. It is not feasible to implement partial libraries exclusively for the "pyxml" library, as these would be difficult to maintain while Codon is still under development.

In the case of implementing "xml" in Codon, it is not a priority because the standard "xml" package already makes use of pyexpat ("Fast XML parsing using Expat"), which is implemented in C. Therefore, it is not of significant interest to the development of Codon at this time. Remember that it took Python 32 years to get to this point. Codon is just 1 year old.

Instead, I plan to create a version of "pyxml" in the "codon" branch that is compatible with Codon. This will serve as a test case and a point of interest for Codon, but it is not intended to be added to the Codon standard library, as it requires heavy optimization.