beeware / voc

A transpiler that converts Python code into Java bytecode
http://beeware.org/voc
BSD 3-Clause "New" or "Revised" License

GSoC 2017 proposal: Complete the implementation of the FrozenSet, Set and Str builtin data types. Partial implementation of the JSON module #487

Closed ASP1234 closed 7 years ago

ASP1234 commented 7 years ago

Project description

Abstract/summary:

VOC is a transpiler that converts Python code into Java class files so that Python code can run on the JVM. This lets you run Python code anywhere there is a JVM, including Android phones and web applications deployed in a J2EE container.

In order to replicate Python behavior in Java, Python's logic for all the basic operations needs to be implemented in the VOC Java support library.

Describe the need your project fulfills:

According to Effective Java, one idiom of good software is to favor immutability whenever possible:

A frozenset is an immutable set - it's set up with the frozenset() function call, but once it's set up its contents cannot be altered. Operations which are non-intrusive (i.e. read only) work on a frozenset in the same way that they work on sets, but of course anything that writes to / modifies a set can't be used on a frozenset.
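The read-only versus mutating split described above can be seen directly in CPython, which is the behavior a VOC implementation would need to match. A minimal sketch (the variable names are illustrative only):

```python
# Illustrative names; the behavior shown is standard CPython.
fs = frozenset([1, 2, 3])
s = {3, 4}

# Read-only operations work the same way as on a set.
union = fs | s                    # result type follows the left operand
common = fs.intersection(s)
disjoint = fs.isdisjoint({9, 10})

# Mutating methods simply do not exist on frozenset.
try:
    fs.add(4)
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# Being immutable, a frozenset is hashable and can serve as a dict key.
lookup = {fs: "frozen"}
```

Note that mixing a frozenset with a set in a binary operation returns the type of the first operand, a detail any reimplementation would have to preserve.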

One of the most widely used data types is the string. A string consists of one or more characters, which can include letters, numbers, or other types of characters. A complete implementation of the str data type is therefore crucial for this project.

JSON is JavaScript Object Notation. It is a much more compact way of transmitting data across network connections than XML, and is therefore widely preferred and used.

How will your project meet this need:

For FrozenSet and Set, I have grouped the operations into two categories:

 - Set Operations:

This includes add, clear, copy, difference, difference_update, discard, remove, intersection, isdisjoint, issubset, issuperset, pop. Java Collections can be used here. The main challenge is handling immutability in the implementation of the frozenset operations. Conveniently, Set and FrozenSet resemble each other in many ways, so I'll be tackling both simultaneously.

 - Built-in Functions:

The Python interpreter has a number of functions built into it that are always available. The output of Python code performing such an operation under VOC should be the same as the output of that code run through CPython.
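One way to picture tackling both types together is to keep the read-only operations in a shared base and add the mutating ones only to the mutable type. The sketch below is a hypothetical Python illustration of that layering, not VOC's actual Java class design; `BaseSet`, `MySet`, and `MyFrozenSet` are invented names, and only a few of the listed operations are shown.

```python
class BaseSet:
    """Read-only operations shared by both variants (hypothetical sketch)."""

    def __init__(self, items=()):
        self._items = set(items)  # stand-in for a Java HashSet

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)

    def __eq__(self, other):
        return isinstance(other, BaseSet) and self._items == other._items

    def issubset(self, other):
        return all(x in other for x in self._items)

    def isdisjoint(self, other):
        return not any(x in other for x in self._items)

    def difference(self, other):
        # Return a new object of the caller's own type; self is untouched.
        return type(self)(x for x in self._items if x not in other)


class MyFrozenSet(BaseSet):
    """Immutable variant: adds hashing, defines no mutators."""

    def __hash__(self):
        return hash(frozenset(self._items))


class MySet(BaseSet):
    """Mutable variant: adds the mutating methods."""

    __hash__ = None  # mutable, therefore unhashable, as in CPython

    def add(self, item):
        self._items.add(item)

    def discard(self, item):
        self._items.discard(item)
```

With this layout, a fix to a shared method such as `difference` applies to both types at once, and immutability is enforced by the mutators being absent rather than by runtime checks.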

For Str, I have grouped the operations into two categories:

 - Complex Operations:

These operations have no straightforward Java equivalent or require additional logic. This includes translate, maketrans, format, format_map.

 - Java Support Functions:

This includes built-in functions, Unicode operations, and splitlines.
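To give a sense of why some of these methods need extra logic, here is a small reference run of the named str methods in CPython. The inputs and the `Default` helper class are illustrative assumptions; the results are what a VOC implementation would be expected to reproduce.

```python
# str.maketrans builds a translation table; str.translate applies it.
# Mapping a character to None deletes it.
table = str.maketrans({"a": "4", "e": "3", "!": None})
translated = "beeware!".translate(table)

# str.format supports positional and keyword fields plus format specs.
formatted = "{name:>8}|{0:05.1f}".format(3.14159, name="voc")


# str.format_map takes the mapping itself, so a dict subclass
# (hypothetical helper) can supply defaults for missing keys.
class Default(dict):
    def __missing__(self, key):
        return "<" + key + ">"


mapped = "{lang} on {vm}".format_map(Default(lang="Python"))

# str.splitlines recognises several line boundaries, not only "\n".
lines = "one\ntwo\r\nthree\x0bfour".splitlines()
kept = "one\ntwo".splitlines(keepends=True)
```

The format mini-language (alignment, padding, precision) and the set of Unicode line boundaries are the kind of detail that has no one-line Java counterpart.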

The two major operations of the JSON module:

 - Encode:

Encodes the Python object into a JSON string representation.

 - Decode:

Decodes a JSON-encoded string into a Python object.
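For reference, the two operations correspond to `json.dumps` and `json.loads` in CPython's standard library. A minimal round trip, using made-up sample data:

```python
import json

# Sample data is illustrative only.
data = {
    "name": "voc",
    "tags": ["python", "jvm"],
    "active": True,
    "parent": None,
    "ratio": 0.5,
}

# Encode: Python object -> JSON string (sort_keys gives a stable key order).
encoded = json.dumps(data, sort_keys=True)

# Decode: JSON string -> Python object.
decoded = json.loads(encoded)

# Tuples have no JSON equivalent, so they come back as lists.
round_tripped = json.loads(json.dumps((1, 2)))
```

The type mapping (True to true, None to null, tuple to array) is part of the behavior that would have to match CPython.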

For this, the JSON.Simple library is available to parse JSON strings in Java. However, using it means introducing an external dependency into the project. So we either need to implement a parser of our own or rely on JSON.Simple.

Timeline/milestones:

Note: The whole process will be accomplished in stages, so that rebasing and merging don't pose any difficulties.

May 4, 2017 - May 30, 2017 [Community Bonding]
May 30, 2017 - June 2, 2017
June 5, 2017 - June 9, 2017
June 12, 2017 - June 16, 2017
June 19, 2017 - June 23, 2017
June 26, 2017 - June 30, 2017 [Phase 1 Evaluation]
July 3, 2017 - July 7, 2017
July 10, 2017 - July 14, 2017
July 17, 2017 - July 21, 2017
July 24, 2017 - July 28, 2017 [Phase 2 Evaluation]
July 31, 2017 - Aug 4, 2017
Aug 7, 2017 - Aug 11, 2017
Aug 14, 2017 - Aug 18, 2017
Aug 21, 2017 - Aug 29, 2017

What broader goal is your project working towards?

Making VOC more approachable and accessible to new and existing users, and expanding its use cases.

What resources will you need: people, documentation, literature, sample data, hardware if applicable:

Setup

Context

I'd be elated if the proposal were accepted, as it would allow me to make more advanced contributions to open source. The idea of contributing to open source and helping non-technical people through coding and problem-solving skills is exciting.

Ongoing involvement

I intend to contribute regularly to the VOC codebase and be involved in the Gitter channel. I have already made contributions to the VOC repository. I will continue contributing to the VOC codebase after GSoC 2017, and I intend to stay around as an active member of PyBee after the summer is over.

Commitment

I understand that this is a serious commitment. I don't have any other major commitments and I will devote ~40 hours a week during the entire tenure.

freakboy3742 commented 7 years ago

Thanks for submitting this proposal. I have two major comments:

  1. I'm not convinced you've done enough exploration of the amount of work involved for each method. For example, in the week of June 26, you're proposing to implement str.isidentifier, str.isnumeric, and str.isprintable - three relatively straightforward boolean checks. The following week, you're proposing to implement str.casefold, str.splitlines, str.format, and str.format_map - four significantly more complex methods. A similar problem is evident in the weeks of June 5 and June 12 - FrozenSet and Set will be mostly identical in implementation - yet you've allocated the same amount of time to implement the same two methods on both classes. The timeline you've proposed feels like you've determined what you want to do, and then split it up into 12 weeks, rather than working out how much time each task will take.

  2. You've suggested using JSON.Simple as a library. That's fine - there's no need to re-invent the JSON parsing wheel. But why is JSON.Simple the right library to use? What alternatives have you investigated?

freakboy3742 commented 7 years ago

Unfortunately, this proposal wasn't accepted.