rmetzger / stratosphere-sql

My private playground to develop SQL support on Stratosphere
Apache License 2.0

New JsonSchema questions #20

Open rmetzger opened 10 years ago

rmetzger commented 10 years ago

Hi,

I received the following questions that I want to publicly answer here:

Architecture Questions

Stratosphere-sql receives the json-schema folder and parses each schema (JsonSchema), which contains the file information, the fields for the CSV format, etc. Stratosphere-sql then passes the schema to the relevant adapter, which takes the fields and creates a type factory (in the case of CSV; for Avro, we will read the Avro file and derive the types from it).

I added documentation to the SchemaAdapters in this commit: https://github.com/rmetzger/stratosphere-sql/commit/fe4adaebc15e2b897c466417b5a405ada83a69fe
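To make the flow described in the question concrete, here is a minimal sketch of the adapter idea. It is not the code from the repository: `SchemaAdapterSketch`, `SchemaAdapter`, `CsvAdapter`, `adapterFor`, and the schema keys `type`, `fields`, and `name` are all hypothetical, and a plain `Map` stands in for the parsed JSON schema so the example needs no JSON library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from stratosphere-sql.
class SchemaAdapterSketch {

    /** Turns one parsed schema into an ordered list of "name:TYPE" column descriptors. */
    interface SchemaAdapter {
        List<String> describeColumns(Map<String, Object> schema);
    }

    /** CSV case: the columns are listed directly in the schema's "fields" entry (assumed key). */
    static class CsvAdapter implements SchemaAdapter {
        @Override
        public List<String> describeColumns(Map<String, Object> schema) {
            Object fields = schema.get("fields");
            if (!(fields instanceof List)) {
                throw new IllegalArgumentException("CSV schema needs a 'fields' list");
            }
            List<String> columns = new ArrayList<>();
            for (Object entry : (List<?>) fields) {
                Map<?, ?> field = (Map<?, ?>) entry;
                columns.add(field.get("name") + ":" + field.get("type"));
            }
            return columns;
        }
    }

    /** Picks the adapter from the schema's "type" entry (assumed key). */
    static SchemaAdapter adapterFor(Map<String, Object> schema) {
        Object type = schema.get("type");
        if ("csv".equals(type)) {
            return new CsvAdapter();
        }
        // An Avro adapter would read the column types from the Avro file instead.
        throw new IllegalArgumentException("No adapter for schema type: " + type);
    }

    public static void main(String[] args) {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("name", "id");
        id.put("type", "INTEGER");

        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", "csv");
        schema.put("fields", Arrays.asList(id));

        System.out.println(adapterFor(schema).describeColumns(schema));
    }
}
```

In this sketch the dispatcher only looks at the schema's type, so supporting Avro would mean adding one more adapter class without touching the CSV code.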

Suggestions

We don't need JSONSchemaUtils, because most of the utilities are not generic to JSON but are rather specific to CSV. I believe we can migrate them to CSVSchemaUtils or something similar.

I don't see any method in the utils that is specific to the CSVSchemaAdapter. You will also need the methods in the utils to access certain configuration values in the Avro schema, for example the filePath (which can be accessed using getStringField()).
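As an illustration of why such a helper is format-independent, here is a rough sketch. The real `getStringField()` signature in the repository may differ; the class name `JsonSchemaUtilsSketch`, the two-argument signature, the `Map` standing in for the parsed schema, and the sample path are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real JSONSchemaUtils may look different.
class JsonSchemaUtilsSketch {

    /** Returns the string stored under key, or fails loudly if it is missing or not a string. */
    static String getStringField(Map<String, Object> schema, String key) {
        Object value = schema.get(key);
        if (!(value instanceof String)) {
            throw new IllegalArgumentException("Schema field '" + key + "' is missing or not a string");
        }
        return (String) value;
    }

    public static void main(String[] args) {
        // The same lookup works whether the schema describes a CSV or an Avro file.
        Map<String, Object> avroSchema = new LinkedHashMap<>();
        avroSchema.put("type", "avro");
        avroSchema.put("filePath", "/tmp/example.avro"); // made-up path

        System.out.println(getStringField(avroSchema, "filePath"));
    }
}
```

Nothing in this helper depends on the file format, which is the point of the reply above: a CSV adapter and an Avro adapter would both call it to read `filePath`.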

Also, at least during development, can we remove the checkin-style?

You mean the maven-checkstyle-plugin? I understand that the plugin is annoying, but if you configure your environment correctly, there should not be any errors (I think you have to disable star imports in IntelliJ).

Which rules bother you?

rmetzger commented 10 years ago

Now I understand why you were asking about the checkstyle plugin. Since you are using IntelliJ, it is a real problem if Maven is unable to build (afaik, IntelliJ compiles using a regular Maven build). Please let me know if you have any issues with your IDE. I know that this is a complex project with many weird dependencies.