jetstreamapp / soql-parser-js

Javascript SOQL parser
https://jetstreamapp.github.io/soql-parser-js/
MIT License
77 stars 20 forks source link

composeQuery relationship clarification #46

Closed ChuckJonas closed 5 years ago

ChuckJonas commented 5 years ago

First off, awesome work. I started to write my own parser using antlr4ts and the grammar posted by Mulesoft and then (THANK GOD) I came across this pkg. Seems like a really complete implementation.

I do however have a question on how the composeQuery() is suppose to work in regards to relationship fields.

On the field interface, I see the relationshipFields: string[] prop, however, it seems to ignore this array and instead just output whatever is in the text property. Is this intended behavior?

Example:

let qryStr = composeQuery({
    "fields":[
        {"text":"Name"},
        {
            "text":"Name",
            "relationshipFields":['Account']
        }
    ],
    "subqueries":[],
    "sObject":"contact"
});

//OUTPUT: "SELECT Name, Name FROM contact"
paustint commented 5 years ago

@ChuckJonas - sorry for the slow reply, I did not notice your tickets.

Mulesoft and then (THANK GOD) I came across this pkg. Seems like a really complete implementation.

I was in your same boat and after a few fits and starts I got this library put together. I am planning to incorporate it into an application I own, but so far I have only parsed the composed query to get the fields - so it is very likely that some small tweaks may need to be made to the logic or structure. I am very open to your input and am willing to consider and suggestions you might have.

The way this works now is that it only uses the text and the relationshipFields fields is just the split parameters. So in your case you can either simply supply the following:

        {
            "text":"Account.Name"
        }

I was actually questioning the design of this myself - and it is obvious that it is not intuitive. What do you think after seeing my explanation?

Here are some example test cases - notice the first one is odd because Contact is the object queried, so it is actually ignored. (dumb example, but was in the SFDC docs and it is actually a valid query - but I have no idea why someone would do it this way.)

  {
    testCase: 11,
    soql: 'SELECT Contact.FirstName, Contact.Account.Name FROM Contact',
    output: {
      fields: [
        {
          text: 'Contact.FirstName',
          relationshipFields: ['Contact', 'FirstName'],
        },
        {
          text: 'Contact.Account.Name',
          relationshipFields: ['Contact', 'Account', 'Name'],
        },
      ],
      subqueries: [],
      sObject: 'Contact',
    },
  },

  {
    testCase: 12,
    soql: "SELECT Id, Name, Account.Name FROM Contact WHERE Account.Industry = 'media'",
    output: {
      fields: [
        {
          text: 'Id',
        },
        {
          text: 'Name',
        },
        {
          text: 'Account.Name',
          relationshipFields: ['Account', 'Name'],
        },
      ],
ChuckJonas commented 5 years ago

I think that just using the text is fine, and maybe more flexible... I would recommend updating the types on compose() so that it doesn't expect values that aren't going to be taken into account.

The simplest way to do this is to probably use the Pick<> mapped type...

maybe something like:

type ComposeField = Pick<Field, 'text', 'subqueryObjName'>;
paustint commented 5 years ago

@ChuckJonas - thanks for the suggestion, I have not used Pick<> so I will need to read the docs.

ChuckJonas commented 5 years ago

I was thinking more about this, and I think long term it would be really nice if the select types were more robust. EG instead of just putting every possible field in a single type, which allows you to build invalid combinations, do something like this:

type ParentField = {relationships: string[], field: string};
type FunctionField = {fn: SOQLFunction, params: string | string[], alias?: string};
type SelectSubquery = ... subset of Query that is valid in Select
type Select = string | ParentField | FunctionField | SelectSubQuery;

A select might look like this

typescript

{
    "fields":[
        'Name',
        {fn: 'COUNT', params: 'Id'},
        {relations: ['Account'], field: 'name'},
        {
              fields: ['id']
              from: 'Activities'
        }
    ],
    "sObject":"contact"
});

Just something to think about

paustint commented 5 years ago

@ChuckJonas - interesting idea and I will think about it. I would probably shy away from having the "String" as a type because that does not distinguish a string from a boolean from a number from a data literal. I would prefer to have every type as an object type and I would prefer to also have some very concrete data point that says exactly what the field type is so a consumer knows how to use the data.

I will think about it over the next week or two and iterate to see what works. There will more than likely be some breaking changes and I am not yet using semantic versioning until I get solid foundation and decide to release a version 1.0 - so keep an eye out as it may have impacts for you :)

Thanks again for your feedback - please open any issues for things your find or questions you have, as I value your feedback!

ChuckJonas commented 5 years ago

The string is nice for ease of use and flexibility when composing queries.

Also, how do you plan to infer type information? Seems like you would have to pull down metadata...

IMO, it would make much sense to build a separate package on top of this to do type checking and leave this a just a pure parser/composer/formatter.

paustint commented 5 years ago

@ChuckJonas - what I meant by type was not the type of the SFDC field, but the type of the data structure.

Also, while a "string" provide ease of use when composing a query, for a consumer to make sense of each field from a parsed query, there will need to be all sorts of combinations of data checks (currField typeof 'string', currField typeof 'object' && currField.fn, currField typeof 'object' && Array.isArray(currField.relations), etc..). It would be much easier is there was some information about the data type of a field to allow parsing like currField.type === 'FunctionExpression' or by using a switch statement.

paustint commented 5 years ago

@ChuckJonas - here is what I am thinking.

Some data points on the field are useful if a SOQL query is parsed, but are irrelevant if a consumer is building a composed data structure, in which case the properties are benign. I have called out such cases below (e.x. isAggregateFn rawValue - you would not need to populate those, but it is useful information to know from a parsed query because SFDC returns the data differently if there is not an alias on an aggregate field)

export interface Field {
  type: 'Field';
  field: string;
  objectPrefix?: string;
}

export interface FieldFunctionExpression {
  type: 'FieldFunctionExpression';
  fn: string;
  parameter: string;
  alias?: string;
  isAggregateFn?: boolean; // not required for compose, will be populated if SOQL is parsed
  rawValue?: string; // not required for compose, will be populated if SOQL is parsed with the raw value of the entire field
}

export interface FieldRelationship {
  type: 'FieldRelationship';
  field: string;
  relationships: string[];
  rawValue?: string; // not required for compose, will be populated if SOQL is parsed with the raw value of the entire field
}

export interface FieldSubquery {
  type: 'FieldSubquery';
  from: string;
  subquery: Query;
}

export interface FieldTypeOf {
  type: 'FieldTypeof';
  field: string;
  conditions: FieldTypeOfCondition[];
}

export interface FieldTypeOfCondition {
  type: FieldTypeOfConditionType;
  objectType?: string; // not present when ELSE
  fieldList: string[];
}

export type FieldType = Field | FieldFunctionExpression | FieldRelationship | FieldSubquery | FieldTypeOf;

export interface Query {
  fields: FieldType[];
  sObject?: string;
  sObjectAlias?: string;
  sObjectPrefix?: string[];
  sObjectRelationshipName?: string;
  where?: WhereClause;
  limit?: number;
  offset?: number;
  groupBy?: GroupByClause;
  having?: HavingClause;
  orderBy?: OrderByClause | OrderByClause[];
  withDataCategory?: WithDataCategoryClause;
  for?: ForClause;
  update?: UpdateClause;
}
paustint commented 5 years ago

@ChuckJonas - one other idea I have is to provide some utility functions for compose to soften some of the sharp edges when a compose is being built - A possible example using your suggested format:

getComposedFieds(
    'Name',
    {fn: 'COUNT', params: 'Id'},
    {relationships: ['Account'], field: 'Name'}, // alternatively this could even be 'Account.Name'
    {(...subquery...)}
);