josue-rojas / csv-yaml

a python script to convert csv to yaml files using PyYaml

How to remove single quotes from string-based key/values #11

Open mathewfer opened 6 years ago

mathewfer commented 6 years ago

Hi,

If a column of the CSV file contains numbers, such as 200, 300, etc., the generated YAML output wraps the value (not the key) in single quotes, as shown below, even though the original value has no quotes.

Only numbers get the single quotes, so it looks like they are being treated as strings.

Is there a way to get the values exactly as they appear in the CSV file, without these single quotes?

- group: ABC
  number_1: '200'
  number_1: '300'
  value_1: good1
- group: CDE
  number_1: '201'
  number_1: '301'
  value_1: good2

Awaiting your reply soon,

Mathew

josue-rojas commented 6 years ago

I was testing some things, and this usually happens when a string is passed in, which is odd since the other strings do not get quotes. I am guessing it has to do with how YAML handles strings that could be read as integers. I found this, but their problem was with strings. I was thinking of adding an int parse before dumping, but it would throw a bunch of errors since most values are strings.

I will look into it more, but I wanted to share what I have found so far in case it helps get closer to a solution.

josue-rojas commented 6 years ago

I found this, which just tries to parse a string as an int and returns the int if it succeeds or the original value if it doesn't. Then I overrode the zip function in Python so that it tries to parse each element.

I added this before the first function.

def intTryParse(value):
    try:
        return int(value)
    except ValueError:
        # print('return reg')
        return value

def zip(*iterables):
    # zip('ABCD', 'xy') --> Ax By
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            # print(intTryParse(elem))
            if elem is sentinel:
                return
            elem = intTryParse(str(elem))
            result.append(elem)
        yield tuple(result)
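The same coercion can also be done without shadowing the built-in zip. This is only a minimal sketch, assuming the rows are read with csv.DictReader (how csv-yaml actually reads its input is not shown in this thread); int_or_str and read_rows are hypothetical names used for illustration.

```python
import csv
import io


def int_or_str(value):
    """Return value as an int if it parses as one, otherwise unchanged."""
    try:
        return int(value)
    except (ValueError, TypeError):
        # TypeError covers None, which DictReader uses for missing cells.
        return value


def read_rows(csv_file):
    """Read CSV rows into dicts, converting integer-looking cells to int."""
    reader = csv.DictReader(csv_file)
    return [{key: int_or_str(val) for key, val in row.items()} for row in reader]


# In-memory sample data standing in for a real CSV file.
sample = io.StringIO("group,number_1,value_1\nABC,200,good1\nCDE,201,good2\n")
rows = read_rows(sample)
```

Because the values in rows are real ints rather than strings, a YAML dumper should have no reason to quote them.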

I think you could also go into the PyYAML directory and change how it handles each type of object, like strings, ints, and bytes. I think representer.py is where the YAML output for each type is defined. It would be more work to trace through their code, but the idea would be similar: you would need to parse the string as an int and fall back to the original value for everything else.
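Instead of editing representer.py inside the installed package, PyYAML also allows a custom representer to be registered on a Dumper subclass. The following is a rough sketch of that approach, not something the script currently does; IntLikeDumper and represent_int_like_str are hypothetical names, and the regular expression is an assumption about what should count as an integer.

```python
import re

import yaml

# Assumption: only plain decimal integers without leading zeros count.
INT_PATTERN = re.compile(r"-?(?:0|[1-9]\d*)")


class IntLikeDumper(yaml.SafeDumper):
    """Dumper subclass so the global SafeDumper is left untouched."""


def represent_int_like_str(dumper, data):
    # Tag integer-looking strings as ints so they are emitted unquoted.
    if INT_PATTERN.fullmatch(data):
        return dumper.represent_scalar("tag:yaml.org,2002:int", data)
    return dumper.represent_scalar("tag:yaml.org,2002:str", data)


IntLikeDumper.add_representer(str, represent_int_like_str)

rows = [{"group": "ABC", "number_1": "200", "value_1": "good1"}]
output = yaml.dump(rows, Dumper=IntLikeDumper, default_flow_style=False)
print(output)
```

One caveat with this design: anyone who loads the file back will get ints for those values, so it only fits columns that are meant to be numeric.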

mathewfer commented 6 years ago

Hi

Thanks for checking on this. I also found it to be an "int" versus "string" type issue. I will try adding your code and let you know. Alternatively, I am thinking of stripping these quotes from the generated file and writing a new file without them.
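The post-processing idea could look roughly like the sketch below. It assumes the generated file only contains simple "key: 'value'" lines like the ones in the first comment; unquote_ints is a hypothetical helper name, and the pattern deliberately skips values with leading zeros so they are not reinterpreted when the file is loaded.

```python
import re

# Matches "key: '123'" lines (optionally starting a list item with "- ")
# and captures the prefix and the integer separately.
QUOTED_INT = re.compile(
    r"^([ \t]*(?:- )?[^:#\n]+:[ \t]+)'(-?(?:0|[1-9]\d*))'[ \t]*$",
    re.MULTILINE,
)


def unquote_ints(yaml_text):
    """Remove single quotes around integer values in simple YAML text."""
    return QUOTED_INT.sub(r"\1\2", yaml_text)


generated = "- group: ABC\n  number_1: '200'\n  value_1: good1\n"
cleaned = unquote_ints(generated)
print(cleaned)
```

Quoted values that are not integers are left alone, so ordinary strings keep their quotes if they had any.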

Mathew

josue-rojas commented 6 years ago

I also think that if you know where the ints are supposed to be, you can just parse those values in whatever will be consuming the file, instead of checking the whole file for quotes. Anyway, tell me how it goes and whether you find a good solution.