ruby-rdf / rdf

RDF.rb is a pure-Ruby library for working with Resource Description Framework (RDF) data.
http://rubygems.org/gems/rdf
The Unlicense

Parsing of Basic Schema.org #410

Closed createmycookbook closed 3 years ago

createmycookbook commented 4 years ago

How could one use this library to parse basic JSON-LD/Microdata from a webpage so that the resulting data has a consistent shape? For example, when parsing a recipe such as https://www.allrecipes.com/recipe/16670/crostini-demily:

Looking for a result like:

[ { 
     name: "Crostini D'Emily",
     recipeInstructions: [
       "Preheat oven to 375 degrees F (190 degrees C).", // sometimes published as a HowTo step; should be normalized to plain text
       "Slice the baguette crosswise into 1/4 inch thick slices",
       ...
     ],
     recipeIngredient: [
          "1 day old baguette",
          "¼ cup butter, softened",
          "1 tablespoon olive oil",
          "3 cloves garlic, chopped"
        ],

       // other specified schema.org properties parsable as a simple map?
 }]

from a query

graph = RDF::Graph.load("https://www.allrecipes.com/recipe/16670/crostini-demily")
RDF::Query.execute(graph, recipe: {
  RDF.type => RDF::Vocab::SCHEMA.Recipe
}).map do |type_solution|
  # bind the subject matched by the type query so the patterns below can use it
  recipe = type_solution.recipe
  q = RDF::Query.new do
    pattern([ recipe, RDF::Vocab::SCHEMA.name, :name ], optional: true)
    # the next may be Text or HowTo (how to standardize? Framing?)
    pattern([ recipe, RDF::Vocab::SCHEMA.recipeIngredient, :recipeIngredient ], optional: true)
    ...
  end

Any ideas? Sorry, I'm new to this.

gkellogg commented 4 years ago

This gem forms the core data model for describing RDF/Linked Data. As such, it natively parses only the core formats, N-Triples and N-Quads. There is a whole ecosystem around it, including other parsers, query engines, and data storage models.

Try installing the “linkeddata” gem, which includes this gem and many others. You can then use either the programmatic reader interface or the “rdf” CLI, which is installed along with it.

These are all used in a couple of online services, including rdf.greggkellogg.net, which basically uses the internals of the “rdf” CLI and will likely do what you want.