carlkma / free-text

MIT License
Identify missing colon #1

Open carlkma opened 2 years ago

carlkma commented 2 years ago

Return the exact position(s) where a colon is missing.

carlkma commented 2 years ago

Solution Overview

Colons serve three (3) primary purposes in Python:

  1. a slice operator, e.g., sub_string = a_string[3:9]
  2. a signal for the start of indented blocks, e.g., after for i in range(10):
  3. a designation for key:value pairs in a dictionary, e.g., a_dict = {"a_key": "a_value"}

A different technique is needed to identify missing colons in each of the three cases.

Case 1: Colon as a Slice Operator

Several characteristics of colons used in this case include:

Intuitively, it is difficult to identify a missing colon that was intended as a slice operator. For instance, there is no way to infer from a_string[39] that the user wants the substring a_string[3:9] rather than the single character at index 39.
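To make the ambiguity concrete, the sketch below parses both spellings with the standard ast module and shows that each is valid Python. The helper name subscript_kind is hypothetical and only serves the illustration.

```python
import ast

def subscript_kind(source):
    """Return 'slice' if the subscript in `source` is a slice, else 'index'."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Subscript):
        raise ValueError("expected a single subscript expression")
    # ast.Slice marks a[3:9]; anything else (e.g. a constant) is a plain index
    return "slice" if isinstance(node.slice, ast.Slice) else "index"

# Both forms parse without error, so the parser alone cannot flag a dropped colon.
for text in ("a_string[39]", "a_string[3:9]"):
    print(text, "->", subscript_kind(text))
```

Because neither form raises a SyntaxError, any detection for this case has to rely on heuristics rather than on the parser.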

Some ideas for heuristics:

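A Python implementation is given below:

import re

MAX_ALLOW_INDEX = 50  # literal indices above this are treated as suspicious

def missing_Colon_for_Slicing(line):
    # Look for a subscript that holds only digits, e.g. "[390]"
    match = re.search(r"\[(\d+)\]", line)
    if match is None:
        return -1
    # A very large literal index may be a slice whose colon was dropped
    if int(match.group(1)) > MAX_ALLOW_INDEX:
        return match.start(1)  # position of the first digit of the index
    return -1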
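Since the issue asks for the position(s) of missing colons, one possible variant of the heuristic above reports every suspicious subscript on a line instead of only the first. This is a sketch, not a settled design: the function name find_suspicious_indices and its max_index parameter are hypothetical, and the threshold of 50 is simply carried over from MAX_ALLOW_INDEX.

```python
import re

MAX_ALLOW_INDEX = 50  # assumed cutoff, taken from the heuristic above

def find_suspicious_indices(line, max_index=MAX_ALLOW_INDEX):
    """Return the column of every digits-only subscript whose value exceeds max_index."""
    positions = []
    for match in re.finditer(r"\[(\d+)\]", line):
        if int(match.group(1)) > max_index:
            positions.append(match.start(1))  # column of the first digit
    return positions

print(find_suspicious_indices("x = a[390] + b[7] + c[1200]"))  # prints [6, 22]
```

An empty list plays the role of the -1 return value: it means no subscript on the line looked suspicious.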

Case 2: Colon as a Signal for Indentation

Several characteristics of colons used in this case include:

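A Python implementation is given below:

import re

BLOCK_KEYWORDS = ("if", "while")  # ... plus the other block-opening keywords

def missing_Colon_for_Indentation(line):
    stripped = line.rstrip()
    # Compare the first word only, so "diff = 3" is not mistaken for an "if"
    first_word = re.match(r"\s*(\w*)", stripped).group(1)
    if first_word in BLOCK_KEYWORDS and not stripped.endswith(":"):
        return len(stripped)  # the index where the missing colon should be
    return -1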

In the case of nested indentation, a recursive or stack-based approach may be used to identify missing colons. (To be completed.)
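One possible reading of the stack idea is sketched below; it is an assumption about the intended design, not the completed solution. The sketch keeps a stack of indentation widths and treats any line that is followed by a deeper-indented line as a block opener, which must therefore end with a colon. The function name find_missing_block_colons is hypothetical. Known limitations: expressions continued across lines inside brackets, mixed tabs and spaces, and trailing comments after the colon are not handled.

```python
def find_missing_block_colons(source):
    """Return (line_number, column) pairs where a block-opening colon seems to be missing."""
    findings = []
    indent_stack = [0]   # indentation widths of the blocks currently open
    previous = None      # (line_number, text) of the last non-blank line
    for line_number, raw in enumerate(source.splitlines(), start=1):
        text = raw.rstrip()
        if not text.strip() or text.lstrip().startswith("#"):
            continue     # skip blank and comment-only lines
        indent = len(text) - len(text.lstrip())
        if indent > indent_stack[-1]:
            # Deeper indentation: the previous line must have opened a block
            if previous is not None and not previous[1].endswith(":"):
                findings.append((previous[0], len(previous[1])))
            indent_stack.append(indent)
        else:
            # Same level or dedent: close every block deeper than this line
            while indent < indent_stack[-1]:
                indent_stack.pop()
        previous = (line_number, text)
    return findings

sample = "for i in range(3)\n    if i\n        print(i)\n    print('x')\n"
print(find_missing_block_colons(sample))  # prints [(1, 17), (2, 8)]
```

Each pair gives the 1-based line number and the 0-based column at which the colon would be inserted, matching the convention of returning len(line) in the single-line check.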