echemdb / unitpackage

A Python library to interact with a collection of frictionless datapackages
https://echemdb.github.io/unitpackage/
GNU General Public License v3.0

Implement Makefile in existing code #5

Open DunklesArchipel opened 2 years ago

DunklesArchipel commented 2 years ago

The new Makefile approach (#143) for creating datapackages should probably be included in the existing code to prevent code duplication.

See, for example: def create_examples(cls, name="alves_2011_electrochemistry_6010"):

DunklesArchipel commented 2 years ago

@saraedum It would be great to see an example of how this works. :)

saraedum commented 2 years ago

It's not clear to me what kind of API you want to support. Let's discuss this in person at some point.

DunklesArchipel commented 2 years ago

Following our discussion, we would like to implement the following idea:

import multiprocessing
import os
import subprocess
import echemdb

def create_example(...):
    digitize(source_dir="literature/alves...", target_dir="tmp...")

def digitize(source_dir, target_dir):
    # TODO: global lock ... maybe this is somehow built into make.
    makefile = os.path.join(os.path.dirname(echemdb.__file__), "..", "data", "Makefile")
    subprocess.check_call(["make", f"--makefile={makefile}", f"-j{multiprocessing.cpu_count()}",
                           f"SOURCE_DIR={source_dir}", f"TARGET={target_dir}"])
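
One possible way to make the make call above easier to test, and to approximate the "global lock", is to separate building the argument list from running it. This is only a sketch: `build_make_command`, the `makefile` and `jobs` parameters, and `_MAKE_LOCK` are hypothetical names that do not exist in the code base, and a `threading.Lock` only serializes calls within a single Python process, not across processes.

```python
import multiprocessing
import subprocess
import threading

# Hypothetical module-level lock: at most one make run per Python process.
_MAKE_LOCK = threading.Lock()


def build_make_command(makefile, source_dir, target_dir, jobs=None):
    """Return the argument list for the make call, without running it."""
    if jobs is None:
        jobs = multiprocessing.cpu_count()
    return [
        "make",
        f"--makefile={makefile}",
        f"-j{jobs}",
        f"SOURCE_DIR={source_dir}",
        f"TARGET={target_dir}",
    ]


def digitize(makefile, source_dir, target_dir, jobs=None):
    """Run make while holding the lock; raises CalledProcessError on failure."""
    with _MAKE_LOCK:
        subprocess.check_call(build_make_command(makefile, source_dir, target_dir, jobs))
```

With this split, `build_make_command` can be checked in a unit test without make being installed, while `digitize` stays a thin wrapper around `subprocess.check_call`.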

In addition, the Makefile should distinguish between the following cases from which a datapackage is built:

If a datapackage is uploaded to the repo in literature/...., the JSON should be converted to YAML.

Locally, datapackages can be saved directly to data/generated: data/original/… .csv, .json -> data/generated
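
To make the two cases concrete, here is a rough sketch of the decision in Python rather than in Makefile rules. The function name `plan_build` and the returned labels are made up for illustration, and the handling of a CSV under literature/ is an assumption, since it is not specified above.

```python
from pathlib import PurePosixPath


def plan_build(source):
    """Return a label describing how a datapackage source file would be handled.

    The labels and the directory checks are illustrative assumptions.
    """
    path = PurePosixPath(source)
    if path.suffix not in {".csv", ".json"}:
        return "ignore"
    if path.parts[:1] == ("literature",):
        # Uploaded datapackage: its JSON descriptor is to be converted to YAML.
        # What happens to the CSV is not specified; it is left as-is here.
        return "convert-json-to-yaml" if path.suffix == ".json" else "keep-as-is"
    if path.parts[:2] == ("data", "original"):
        # Local datapackage: can be stored in data/generated right away.
        return "copy-to-generated"
    return "ignore"
```

In the Makefile itself, the same distinction would presumably become separate pattern rules, one per source directory.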

DunklesArchipel commented 2 years ago

If a CSV is linked to a YAML file, the YAML must contain a frictionless field description somewhere.
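
A minimal sketch of what such a check could look like, assuming the field description sits under `schema.fields` as in a frictionless Table Schema. Where exactly it should live in our YAML is still open, and `missing_field_descriptions` is a hypothetical helper. The metadata is shown as an already parsed dictionary so that the example does not depend on a particular YAML loader.

```python
import csv
import io


def missing_field_descriptions(csv_text, metadata):
    """Return CSV column names that have no entry in metadata['schema']['fields']."""
    header = next(csv.reader(io.StringIO(csv_text)))
    fields = metadata.get("schema", {}).get("fields", [])
    described = {field["name"] for field in fields}
    return [name for name in header if name not in described]


# Metadata as it might look after parsing the YAML (assumed layout).
metadata = {
    "schema": {
        "fields": [
            {"name": "E", "type": "number"},
            {"name": "j", "type": "number"},
        ]
    }
}

# The column "t" has no field description, so it is reported.
print(missing_field_descriptions("t,E,j\n0,0.1,0.5\n", metadata))  # ['t']
```

An empty result would mean every CSV column is described, so the datapackage could be built from the YAML alone.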