BrickSchema / py-brickschema

Python package for working with Brick (brickschema.org/)

Brickify help usage and operations #70

Closed Pythonwuerger closed 3 years ago

Pythonwuerger commented 3 years ago

Hello, this is Philip H. from the Google Group.

I looked through the Brickify documentation and I need to reach out for help.

The first thing I have trouble with is the handlers and operations. I get that there are different handlers that can be used, but do I understand correctly that I have to create the operator myself?

I understand the basic way the operator is built for the example, but I can't work out the deeper logic well enough to apply it to a vastly different data set. Worse, the logic of my own data set is even harder for me to understand, and it doesn't have useful column names like the example. Instead it looks like this:

ahu1 only.csv

This is the data set in which I can see the most logic. It would be nice to use the other two as well, but they seem even harder to decode.

Example.xlsx Example2.xlsx

Coding an operator is proving very difficult. Are there any prebuilt operators that could be used as a base? If I split the data by delimiters, would it be easier to write an operator? Is there more info or a tutorial available? For example, why does the data operation appear twice in the template?

Also, which handler would you recommend for these formats? I was told the structure of the first data set should align with the Haystack format, but I wasn't able to confirm that yet. How do I convert it to a Haystack Turtle file so that I can load it with the Haystack handler?

About the command: brickify sheet.tsv --output bldg.ttl --input-type tsv --config template.yml

When running it with other data and template than the examples I get an error:

"'charmap' codec can't decode byte 0x81 in position 462" What does this error message mean?

EDIT: The example sheet in the documentation seems to be already normalized data, so it seems I need to go back one step. Since the documentation mentions that this is a job for a technical support team, is it feasible as a one-person job? How long do you think it would take to normalize the data and write a full operator? I noticed that the first data set looks a bit similar to the test data Gabe used in his OpenRefine + BrickBuilder video. What is the best way to make this data compatible with Brickify?

I would be very thankful for any help.
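On the 'charmap' message quoted above: it is raised by Python itself when bytes are decoded with a single-byte code page (cp1252 is a common default on Windows) that leaves the byte in question, here 0x81, undefined. The sketch below reproduces the message and shows one way to find an encoding that does decode the data. The sample text, the helper name `decode_with_fallback` and the candidate encoding cp850 are assumptions for illustration; the real encoding of the uploaded files is not stated in this thread.

```python
# Hypothetical sample: the word "Lüftung" as bytes in code page 850,
# where the letter ü is stored as the single byte 0x81.
SAMPLE = "L\u00fcftung".encode("cp850")


def decode_with_fallback(raw, encodings=("utf-8", "cp850")):
    """Try each candidate encoding in turn; return (text, encoding used)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Decoding the same bytes as cp1252 fails, because 0x81 is undefined there.
try:
    SAMPLE.decode("cp1252")
    message = ""
except UnicodeDecodeError as err:
    message = str(err)

text, used = decode_with_fallback(SAMPLE)
print(message)
print(text, used)
```

Once the real encoding is known, the file can be opened with that encoding and written back out as UTF-8 before further processing.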

gtfierro commented 3 years ago

@shreyasnagare can you take a look at this?

shreyasnagare commented 3 years ago

What’s happening?: Consider the following table:

object-name object-tag-text
B1 Hchy
B1’A Hchy

When you write an operation like:


operations:
  -
    description: "Adding tag-texts"
    data: |-
      bldg:{object-name} <internal-association-predicate> bldg:{object-tag-text} .

brickify tries to create a URI from bldg:{object-name}, which ends up looking like bldg:B1’A for the second row. The problem is that the single quote is not URL-safe.

If you already know what unsafe characters the headers/values might have, you can use the replace_dict config (example) to let brickify handle the regex string replacements.

The safer approach: @Pythonwuerger If you don’t know all the characters that might cause problems during serialization, a better idea would be to url-encode all the cells before using brickify. You can use urllib.parse.quote_plus or urllib.parse.quote for this.

Also, I see that the last few column headers have the value “null”. If you wish to access that detail from any of the rows, you might wanna rename the headers.

I’m planning to add that option (auto url-encode) as part of the config sometime next week. Until then, you can try and see if urllib works for you.
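A minimal sketch of the url-encoding step suggested above, using only the standard library. The function name `quote_cells` is hypothetical, and leaving the header row untouched (so that `{object-name}` placeholders still match) is an assumption, not something brickify requires.

```python
import csv
import io
from urllib.parse import quote


def quote_cells(src, dst):
    """Copy a CSV from src to dst, url-encoding every cell below the header."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    # Assumption: keep the header row as-is so column names stay usable.
    writer.writerow(next(reader))
    for row in reader:
        # safe="" also encodes "/" which quote() would otherwise keep.
        writer.writerow([quote(cell, safe="") for cell in row])


# Small in-memory example mirroring the table above (straight apostrophe).
source = io.StringIO("object-name,object-tag-text\nB1,Hchy\nB1'A,Hchy\n")
target = io.StringIO()
quote_cells(source, target)
print(target.getvalue())
```

For real files, `src` and `dst` would be file objects opened with `newline=""` and an explicit encoding.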

Pythonwuerger commented 3 years ago

What's happening? : ... well I'm not sure myself to be honest. Thanks for explaining that error. I'll refrain from using unsafe characters. I couldn't get urllib.parse.quote to work yet as it only converts strings, not an entire .csv. I might be able to work out something in combination with pandas though.

I have some further issues:

How come the operations in your example link are built differently from the ones in the documentation?

example

data: |-
      bldg:{Device and Equipment Name} a brick:VAV .

vs

documentation

data: |-
      bldg:{VAV name}_0 rdf:type brick:VAV 

Which format works, or which is better? Are they interchangeable? What do the "bldg:" and the "_0" do?
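Some background on these questions: in Turtle, `a` is shorthand for `rdf:type`, so the two spellings describe the same triple, and `bldg:` is a namespace prefix for the building's own entities. The sketch below shows how a `{column}` placeholder and a literal suffix such as `_0` combine into one name. It assumes placeholders are simply replaced by the cell value of the current row; `fill_row` is a hypothetical helper, not brickify's code, and the row value is made up.

```python
import re


def fill_row(template, row):
    """Replace each {column name} placeholder with that row's cell value."""
    return re.sub(r"\{([^{}]+)\}", lambda match: row[match.group(1)], template)


# Hypothetical row; the column name follows the documentation snippet above.
row = {"VAV name": "VAV1"}

print(fill_row("bldg:{VAV name}_0 rdf:type brick:VAV .", row))
print(fill_row("bldg:{VAV name} a brick:VAV .", row))
```

Under that assumption, `_0` is plain text appended to the substituted value, so the first line names the entity `bldg:VAV1_0` and the second names `bldg:VAV1`.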

I tried to manually change the table format to something like this:

Building Name | AHU name | Energy Recovery | Pump | Command
B1 | AHU1 | AHU1_Erc | AHU1_PU | AHU1_Cmd

Or should I keep more of the tree format, e.g. B1_AHU1_PU? Technically the pump is part of the Erc, and there are more pumps in the AHU. Could this cause issues? Maybe it should be AHU1_Erc_PU_Cmd.

So if I wanted to add the AHU from my files, could I do it like this? The AHU also has a lot of parts, such as a pump for energy recovery commands. (It has many other commands and pumps as well, so I think the setup above might be too ambiguous; perhaps I should keep the tree format, e.g. B1_AHU1_Erc_PU.)


operations:
  -
  description: "adding building"
    data: |-
     bldg:{Building Name}_0 rdf:type brick:Building.

 -
  description: "adding AHU"
    data: |-
     bldg:{AHU name}_0 rdf:type brick:AHU.

 -
  description: "adding Erc"
    data: |-
     bldg:{Energy Recovery}_0 rdf:type brick:("can't find a fitting one").

 -
  description: "adding Pump"
    data: |-
     bldg:{Pump}_0 rdf:type brick:Pump.

 -
   description: "adding Command"
    data: |-
     bldg:{Command}_0 rdf:type brick:Command.

Running this alone yields an error (even without the Erc part).

"yaml.parser.ParserError: while parsing a block node expected the node content, but found '?' in "OP2.yml", line 6, column 3"

I don't understand. What's wrong in line 6? It's just description: "adding building". The message also gives a column, but the yml doesn't have columns?
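A likely reading of this message: the "column" is the character position within the line, and column 3 is where `description` starts when it sits at the same indentation as its `-`. In YAML block style, the keys of a list item must be indented deeper than the dash and must line up with each other, which the operations above do not do. The sketch below is a rough stdlib-only check for exactly that pattern, not a YAML parser; `find_misaligned_keys` and the key list are assumptions based on the keys that appear in this thread.

```python
ITEM_KEYS = ("description", "data", "template")  # keys seen in this thread


def find_misaligned_keys(yaml_text):
    """Return (line number, message) pairs for badly indented item keys."""
    problems = []
    dash_indent = None  # indentation of the most recent bare "-"
    key_indent = None   # indentation of the first key after that "-"
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip(" "))
        if stripped == "-":
            dash_indent, key_indent = indent, None
            continue
        key, colon, _ = stripped.partition(":")
        if dash_indent is None or not colon or key not in ITEM_KEYS:
            continue
        if indent <= dash_indent:
            problems.append((number, f"'{key}' must be indented deeper than its '-'"))
        elif key_indent is not None and indent != key_indent:
            problems.append((number, f"'{key}' must align with the item's first key"))
        if key_indent is None:
            key_indent = indent
    return problems


BROKEN = (
    "operations:\n"
    "  -\n"
    '  description: "adding building"\n'
    "    data: |-\n"
    "     bldg:{Building Name}_0 rdf:type brick:Building.\n"
)
FIXED = (
    "operations:\n"
    "  -\n"
    '    description: "adding building"\n'
    "    data: |-\n"
    "      bldg:{Building Name}_0 rdf:type brick:Building .\n"
)

print(find_misaligned_keys(BROKEN))
print(find_misaligned_keys(FIXED))
```

The `FIXED` layout matches the indentation used in the working operation earlier in this thread: `description` and `data` share one indentation level below the dash.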

Now even if it worked, the relationships aren't mapped, right? I don't see any mapping in the example; in the documentation, however, I find:

  - template: |-
      {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature sensor'], value['sensors'], "brick:Temperature_Sensor") }}

  - template: |-
      {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature setpoint'], value['setpoints'], "brick:Temperature_Setpoint") }}

macros:
  - |-
    {% macro num_triples(subject, predicate, name, num, type) %}
        {% for i in range(num) %}
          bldg:{{ name }}_{{ i }} a {{ type }} .
          bldg:{{ subject }} {{ predicate }} bldg:{{ name }}_{{ i }} .
        {% endfor %}
    {% endmacro %}

I'm not sure how this would be used to link these parts together. To make it harder, there are many more components, such as ventilators, preheaters, and valves, all belonging to different subparts of the AHU with their own settings, sensors and commands.
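To make the macro above easier to read, here is a plain Python rendering of the same loop: for each index it emits one triple that types the numbered entity and one triple that links the subject to it. This is only a restatement of the Jinja macro as quoted, and the argument values in the example call are made up.

```python
def num_triples(subject, predicate, name, num, type_):
    """Python equivalent of the num_triples macro: two triples per index."""
    lines = []
    for i in range(num):
        # Declare the numbered entity, e.g. bldg:A_ts_0 a brick:Temperature_Sensor .
        lines.append(f"bldg:{name}_{i} a {type_} .")
        # Link the subject to it, e.g. bldg:A brick:hasPoint bldg:A_ts_0 .
        lines.append(f"bldg:{subject} {predicate} bldg:{name}_{i} .")
    return lines


# Hypothetical values standing in for one row of the sheet.
triples = num_triples("A", "brick:hasPoint", "A_ts", 2, "brick:Temperature_Sensor")
print("\n".join(triples))
```

The linking happens in the second triple of each pair: the predicate passed in (here `brick:hasPoint`) connects the subject to every generated entity.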

If my understanding is correct, the relationships in the simplified excerpt should be: B1 has equipment AHU1, which has part Erc, which has part Pump, which has point Command. Please correct me if I misunderstood something. How would the template and macros need to be built to map these relationships correctly?

shreyasnagare commented 3 years ago

@Pythonwuerger, sorry about the delay. For the CSV file you uploaded, can you share some operations that you're trying to work with? From what I can see, the headers that you can work with here are: keyname, object-name, device, obj.-instance, object-type, object-instance, description, present-value-default, min-present-value, max-present-value, settable, supports, COV, hi-limit, low-limit, state-text-reference, unit-code, object-tag, object-tag-text.

Feel free to respond with the triples that you're trying to get, for example:

where you assume {object-name} will take values from the object-name column on the CSV, creating similar triples for all the (198) object-names on the file. This will allow me to help you build the brickify configuration you need.

I have tried to fix some of the problems you were facing in PR https://github.com/BrickSchema/py-brickschema/pull/75.

Pythonwuerger commented 3 years ago

Since the three datasheets I have are very different from each other, I tried to work with an example, which led me to the other issue, #74. As of right now, issue #74 is my more specific problem, while this issue is more about understanding.

As I understand it, each of the datasheets I linked would need a custom operator. That's why I think it would be easier to bring them into a unified format and write an operator customized to that format. Please correct me if I'm mistaken.

The most important headers are keyname and object-name; I believe both columns hold the same values. The description column is very helpful for human readability. The tree format has a lot of redundant information, so I believe it's probably necessary to first split the column by the " ' " delimiter.

The ahu1 only.csv is only an excerpt of one data set. There are many more AHUs and buildings. Ideally I'd like to brickify all the data, but for a start a proof of concept is enough for me.

For example, the first entry, B1, would be a building, so it should be assigned the Building Brick class. This building has HVAC systems, which would be the "A" in line three, "B1'A". So the building has equipment HVAC. (I'm not entirely sure it's really HVAC, since the description roughly translates to "ventilation & thermal environment".) This HVAC then has multiple AHUs, starting with the "AHU1" from line four, "B1'A'AHU1". These AHUs have parts and points themselves.

For example, it has shutoff dampers, the "DmpSfOa" in line 13, "B1'A'Ahu1'DmpSfOa", which itself has a command ("Cmd") and a command value ("CmdVal").
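The hierarchy described above can be read straight off the tree-format names. The sketch below splits an object-name on the `'` delimiter and returns each parent/child pair along the path. `parent_child_pairs` is a hypothetical helper, and which Brick relationship belongs on each pair is left open here.

```python
def parent_child_pairs(object_name, sep="'"):
    """Return (parent path, child path) pairs along a tree-format name."""
    parts = object_name.split(sep)
    # Build every prefix of the path: B1, B1'A, B1'A'Ahu1, ...
    paths = [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]
    # Pair each prefix with the next longer one.
    return list(zip(paths, paths[1:]))


pairs = parent_child_pairs("B1'A'Ahu1'DmpSfOa'Cmd")
for parent, child in pairs:
    print(parent, "->", child)
```

Running this over every row and de-duplicating the pairs would give the full tree for a file, which can then be turned into relationship triples once the classes are decided.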

There are parts of the AHU that are currently of special interest. These would be:

26: B1'A'Ahu1'Erc'Pu'KickFnct.OpSta - The energy recovery (Erc) has a pump (Pu), which has a kick function (KickFnct) whose operational status (OpSta) is monitored.
38: B1'A'Ahu1'Erc'Vlv - The energy recovery (Erc) has a valve (Vlv).

42: B1'A'Ahu1'FanEx'Cmd - The AHU has an exhaust fan (FanEx), which is controlled by a command (Cmd).
43: B1'A'Ahu1'FanEx'DP - The exhaust fan has a differential pressure sensor (DP).
45: B1'A'Ahu1'FanEx'Mdlt - It also has a constant activation piloting (Mdlt).

52: B1'A'Ahu1'FanSu'Cmd - The same for the supply fan (FanSu).
53: B1'A'Ahu1'FanSu'DP
55: B1'A'Ahu1'FanSu'Mdlt

123: B1'A'Ahu1'PreHcl'Pu'KickFnct.OpSta - Then there's also a preheater (PreHcl) with a pump, which has a kick function whose operational status is monitored.
131: B1'A'Ahu1'PreHcl'Vlv - The preheater also has a valve.

183: B1'A'Ahu1'TOa - The AHU uses the outside temperature data (TOa).
189: B1'A'Ahu1'TSu - It also checks the temperature of the supply air (TSu).

As you can see, the entries I'd like to have mapped are in quite a confusing format. In addition, for a lot of those parts or commands I was unable to find a fitting Brick class. This is also why I used the simpler data of a heating group in the other issue. Right now I'm more interested in getting the sample data and my prototype operator to produce a Turtle file that isn't empty. That way I have something I can build upon.

shreyasnagare commented 3 years ago

@Pythonwuerger As a starting point, you can try to do something like this:

https://github.com/BrickSchema/py-brickschema/blob/42acd46eef73731db0428503eade325fac38c554/brickschema/brickify/src/handlers/Handler/RACHandler/conversions/rac.yml#L227-L258

Adding the replace_dict might help with the serialization issues (L258 here should convert B1'A'Ahu1'FanEx'Cmd to B1_A_Ahu1_FanEx_Cmd).
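The effect described for the replacement step can be sketched in plain Python as a series of regex substitutions. This is not brickify's implementation and the pattern below is not copied from rac.yml; it is an assumed pattern that replaces every character outside letters, digits and underscore.

```python
import re


def apply_replacements(value, replacements):
    """Apply each regex pattern -> replacement pair to the value in order."""
    for pattern, replacement in replacements.items():
        value = re.sub(pattern, replacement, value)
    return value


# Assumed pattern: anything that is not a letter, digit or underscore.
replacements = {r"[^A-Za-z0-9_]": "_"}

print(apply_replacements("B1'A'Ahu1'FanEx'Cmd", replacements))
```

Note that such a broad pattern also rewrites dots, so a name like KickFnct.OpSta would become KickFnct_OpSta.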

Pythonwuerger commented 3 years ago

As it turned out, it is possible to treat an entire object-name (e.g. B1'A'Ahu1'FanSu'Mdlt) as an instance. With the help of conditions, the keys can be assigned to specific Brick classes depending on how the object-name ends. For a lot of the notations in my data sets, it is first necessary to find out which Brick class fits best, or even to define additional ones.
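The idea of choosing a class from the ending of the object-name can be sketched outside brickify as a lookup on the last path segment. The suffix-to-class table below is an assumption for illustration only; each class should be checked against the Brick version in use, and suffixes without a fitting class are simply skipped.

```python
import re

# Assumed mapping from name endings to Brick classes (illustrative only).
SUFFIX_TO_CLASS = {
    "Cmd": "brick:Command",
    "TOa": "brick:Outside_Air_Temperature_Sensor",
    "TSu": "brick:Supply_Air_Temperature_Sensor",
}


def to_triple(object_name, sep="'"):
    """Return a typing triple for the name, or None if no class is mapped."""
    suffix = object_name.rsplit(sep, 1)[-1]
    brick_class = SUFFIX_TO_CLASS.get(suffix)
    if brick_class is None:
        return None
    # Make the whole object-name usable as a local name.
    local_name = re.sub(r"[^A-Za-z0-9_]", "_", object_name)
    return f"bldg:{local_name} a {brick_class} ."


for name in ["B1'A'Ahu1'TSu", "B1'A'Ahu1'FanSu'Cmd", "B1'A'Ahu1'FanSu'Mdlt"]:
    print(name, "=>", to_triple(name))
```

Names that return `None` form the list of notations that still need a class decision or a new class definition.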