Closed cjihrig closed 1 year ago
Proposal 1 - Parameterize BatchSize and MaximumBatchingWindowInSeconds
At the moment I have to manually edit these values every time I push an update, so anything that parameterizes these values is a welcome change.
Proposal 2 - Querying enhancements
A few years back I implemented my own solution to this - sharing in case it's helpful. I needed to modify the nextToken handling to use search_after. I also made sort an array, and added a string type so that the resolver could sort on keyword or other data types on a field-by-field basis.
I also implement dynamic authorization within the request resolver rather than rely on the dynamic auth in the @auth.
type Query {
  searchAssetsX(
    searchAfter: [String],
    limit: Int,
    sort: [SearchableAssetSortInput],
    from: Int
  ): SearchableAssetConnection
}
input SearchableAssetSortInput {
  field: SearchableAssetSortableFields
  direction: SearchableSortDirection
  string: SearchableStringSort
}
enum SearchableStringSort {
  keyword
  none
}
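A minimal sketch of how a resolver could turn the sort input above into Elasticsearch sort clauses. It assumes that text fields have a `.keyword` sub-field (the default dynamic mapping for strings) and that a missing direction means ascending. The function name `build_sort_clauses` is hypothetical and not part of Amplify.

```python
from typing import Dict, List, Optional


def build_sort_clauses(
    sort_inputs: List[Dict[str, Optional[str]]]
) -> List[Dict[str, Dict[str, str]]]:
    """Translate SearchableAssetSortInput-style dicts into ES sort clauses."""
    clauses = []
    for item in sort_inputs:
        field = item["field"]
        # Assumption: default to ascending when no direction is supplied.
        direction = item.get("direction") or "asc"
        # `string: keyword` means "sort on the keyword sub-field of a text field".
        if item.get("string") == "keyword":
            field = f"{field}.keyword"
        clauses.append({field: {"order": direction}})
    return clauses


if __name__ == "__main__":
    print(build_sort_clauses([
        {"field": "name", "direction": "asc", "string": "keyword"},
        {"field": "createdAt", "direction": "desc", "string": "none"},
    ]))
```

The order of the list is the sort priority, so a second entry acts as a tie-breaker for the first.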
Proposal 3 - Upgrade AWS ElasticSearch version to 7.10 for new projects
What do we do with existing projects? Will this upgrade to 7.10 or should we do that from the ElasticSearch console for existing projects?
Proposal 4: AWS ElasticSearch field updates
Not sure this is the right place to put it - but a major gap is that the schema is not enforced in ElasticSearch. For example, the default field _lastChangedAt on a conflict-enabled database is an Int in the schema/DynamoDB - but gets created as a Float in ElasticSearch. Another example: if I have a string field in the schema whose value in the first record we sync is "1.234", the field is created as a Float, so if the second record's value is "one point two three four" it fails because it is of the wrong type.
The index mapping needs to be explicitly defined for ElasticSearch and not inferred.
PUT /asset/_mapping/doc
{
  "properties": {
    "_lastChangedAt": {"type": "integer"}
  }
}
Hi, thank you for creating this RFC. I thought the project is nearly dead, so I'm happy to see the progress on this.
Please consider adding this one to the RFC: https://github.com/aws-amplify/amplify-cli/issues/5121#issuecomment-675568032
Currently, the value of total aggregation is definitely useless.
I am happy there is work continuing on searchable. I haven't really been able to use it because I cannot mock with it, so hopefully that will be addressed in some fashion. That being said, hopefully there could be some warnings if you accidentally comment out @searchable from the schema and deploy... which I have sadly done more than once.
What other mapping parameters are you going to support in @searchable? I'd like to see some other parameters like boost, because currently you can't specify this in the query.
I feel like count of total items returned by a search query should also be part of this RFC. It's a very large limitation and constantly requested.
Proposal 1: Backfilling - the title sounds like it's going to allow for automatically populating the index if @searchable is added to an existing schema with data already in it. But then the text makes no mention of that. Can you confirm? The method to do that currently is very manual and easily overlooked. Would love it if there was an option to reindex everything when @searchable is added via the CLI.
Proposal 6: Mocking - Allow for an opt-out of @searchable when mocking. When mocking locally I don't always need the ES stuff; often I'm trying to work on a Lambda, which is where I use it the most. If I've got @searchable in my schema, would I be required to have a local ES setup to mock at all? The way @searchable currently won't let you mock at all without commenting it out of your schema is stopping speedy local mocking. I'd rather there was a way to just say "ignore @searchable stuff locally" and still let me mock the rest. This is what was requested originally (https://github.com/aws-amplify/amplify-cli/issues/5981). I think getting mocking of @searchable working is also a very good addition; I just want a way to skip it if I don't need it, for development speed and to avoid yet another thing to keep updating and learning about.
Hi Everyone - thought I'd share this comment on another related issue showing usage for custom searchable resolvers
aws-amplify/amplify-category-api#405
It would be nice if @searchable and Amplify CLI allowed devs to use external search services such as Algolia, because of AWS Elastic/Open search pricing: https://github.com/aws-amplify/amplify-cli/issues/3860 (closed for some reason) and its non-serverless nature.
A very high-level overview of what I mean is that:
E.g. during amplify init or amplify update api, the CLI would ask if we want to edit the search logic manually. If yes, it would give you a couple of options:
1. Ignore @searchable during mocking on the local machine
2. Use @searchable during mocking on the local machine (requires local ES/OpenSearch setup)
3. Use your own search service (2 empty Lambda functions will be generated)
4. Override/specify the ElasticSearch name/resource that should be used for the current env (e.g. by using an ARN?)
Option 3 is especially important for me.
These 2 Lambda functions would be created by running the same flow as 'amplify add function'.
I am already doing that manually, but if searchable was implemented like this, it would require less effort and time.
So if searchable finally added support for non-AWS services in general and finally fixed local mocking (which should have been fixed the day searchable was introduced), it would be nice, I guess.
BTW, there are some forks that modify searchable with custom logic to target non-AWS endpoints: https://github.com/starpebble/amplify-cli/wiki
Elastic/Open search pricing: https://github.com/aws-amplify/amplify-cli/issues/3860 (closed for some reason) and its non-serverless nature.
Just a side note: a serverless OpenSearch service is coming (it's currently in preview).
Is serverless opensearch support coming any time soon?
It's already available I think. https://aws.amazon.com/opensearch-service/pricing/#Amazon_OpenSearch_Serverless
But:
You will be billed for a minimum of 4 OCUs (2x indexing includes primary and standby, and 2x search includes one replica for HA) for the first collection in an account.
Here are more opinions on this: https://www.reddit.com/r/aws/comments/zh2z4o/serverless_opensearch_seems_like_a_huge_deal_but/
Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
This is a Request For Comments (RFC). RFCs are intended to elicit feedback regarding a proposed change to the Amplify Framework. Please feel free to post comments or questions here.
This document outlines a number of new features and improvements to the @searchable directive in Amplify CLI. The goal is to address some of the enhancement requests and bugs.
Proposal 1: Backfilling improvements
Exposing the streaming function:
Parameterize BatchSize and MaximumBatchingWindowInSeconds
The EventSourceMapping parameter of the Lambda function that streams data from the DynamoDB table to AWS ElasticSearch assumes a batch size of one. This enables the Lambda to update documents as soon as possible in AWS ElasticSearch. This is slow if you are trying to migrate a large amount of data. Moving forward, the BatchSize and MaximumBatchingWindowInSeconds will be parameterized so they can be overridden as desired. The default values will remain the same as the current values.
Proposal 2: Querying enhancements
Sorting & Pagination
In the current schema generation, the Amplify CLI generates the sort input as an object. This only supports sorting over a single field. It is currently not possible to introduce another ‘tie-breaker’ field. This makes sorting more difficult when the field is not unique.
To overcome this limitation, this RFC proposes using an array for sorting. The nextToken field can remain a string that is encoded and decoded to an array as needed. The following example demonstrates how this will work. This is a breaking change.
An example query is shown below:
The corresponding AWS ElasticSearch DSL query is shown below:
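As a rough illustration only (not the RFC's own example), the sketch below shows one way a string nextToken could carry an array of sort values and feed Elasticsearch's `search_after` parameter. The base64-encoded JSON token format and all function names are assumptions. The actual encoding Amplify would use is not specified in this RFC.

```python
import base64
import json
from typing import Any, Dict, List, Optional


def encode_next_token(sort_values: List[Any]) -> str:
    """Pack the last hit's sort values into an opaque string token."""
    raw = json.dumps(sort_values).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_next_token(token: str) -> List[Any]:
    """Recover the sort-value array from a token made by encode_next_token."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


def build_query(sort: List[Dict[str, str]], limit: int,
                next_token: Optional[str] = None) -> Dict[str, Any]:
    """Build a search body with a multi-field sort and optional search_after."""
    body: Dict[str, Any] = {
        "size": limit,
        # List order is the sort priority; later entries break ties.
        "sort": [{s["field"]: {"order": s["direction"]}} for s in sort],
    }
    if next_token is not None:
        body["search_after"] = decode_next_token(next_token)
    return body


def next_token_from_hits(hits: List[Dict[str, Any]]) -> Optional[str]:
    """Each hit of a sorted search carries a `sort` array; encode the last one."""
    if not hits:
        return None
    return encode_next_token(hits[-1]["sort"])


if __name__ == "__main__":
    token = next_token_from_hits([{"_id": "1", "sort": ["Widget", "id-001"]}])
    sort = [{"field": "name.keyword", "direction": "asc"},
            {"field": "id.keyword", "direction": "asc"}]
    print(json.dumps(build_query(sort, 10, token), indent=2))
```

Because `search_after` needs one value per sort field, the token has to be an encoded array rather than a single value once sorting accepts more than one field.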
Aggregates
Search queries will also include an aggregates input to run aggregations on the data. Each supported aggregation will require different GraphQL types in order to accommodate various aggregation output signatures. Initially, only term, avg, max, min, and sum will be supported, but other aggregations can be added in the future. An example searchable schema for term aggregation will look like this:
An example search query is shown below:
An example query response is shown below:
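For illustration only, here is a sketch of how an aggregates input could be translated into an Elasticsearch `aggs` body and how the response could be flattened again. The input shape (`name`, `type`, `field`) and the output shape are assumptions, not the RFC's final GraphQL types. Note that Elasticsearch names the bucket aggregation `terms`, which is the spelling used here.

```python
from typing import Any, Dict, List

# The five aggregations the RFC lists as initially supported.
SUPPORTED_AGGREGATIONS = {"terms", "avg", "max", "min", "sum"}


def build_aggs(aggregates: List[Dict[str, str]]) -> Dict[str, Any]:
    """Turn [{'name', 'type', 'field'}, ...] into an ES `aggs` object."""
    aggs: Dict[str, Any] = {}
    for agg in aggregates:
        agg_type = agg["type"]
        if agg_type not in SUPPORTED_AGGREGATIONS:
            raise ValueError(f"unsupported aggregation: {agg_type}")
        aggs[agg["name"]] = {agg_type: {"field": agg["field"]}}
    return aggs


def extract_results(es_aggregations: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the `aggregations` part of an ES response.

    Bucket aggregations (terms) return `buckets`; metric aggregations
    (avg, max, min, sum) return a single `value`.
    """
    results = []
    for name, payload in es_aggregations.items():
        if "buckets" in payload:
            buckets = [{"key": b["key"], "doc_count": b["doc_count"]}
                       for b in payload["buckets"]]
            results.append({"name": name, "buckets": buckets})
        else:
            results.append({"name": name, "value": payload["value"]})
    return results


if __name__ == "__main__":
    print(build_aggs([
        {"name": "byCategory", "type": "terms", "field": "category.keyword"},
        {"name": "avgPrice", "type": "avg", "field": "price"},
    ]))
```

The two result shapes (a bucket list versus a single value) are the reason each aggregation kind needs its own GraphQL output type.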
Proposal 3: Upgrade AWS ElasticSearch version to 7.10 for new projects
The AWS ElasticSearch version will be set to 7.10 for all new Amplify projects.
Proposal 4: AWS ElasticSearch field updates
Support for GraphQL enums
The Amplify CLI does not currently support searching using enum types. Going forward the fields will be written as strings and searched on as keyword fields.
All AppSync scalar types are streamed to AWS ElasticSearch
There are some gaps that need to be closed to complete support for AppSync scalar types. Support for searching over the following field types will be included:
Date/Time Types
AWSDate
AWSTime
AWSDateTime
The searchable input will use strings, but work with range queries with string formats defined by ISO 8601. The field will be assumed to be dynamically typed as date.
String Types
AWSEmail
AWSPhone
AWSURL
Since these data types are typed as keyword fields, the generated GraphQL input would be identical to Strings.
Numeric Types
AWSTimestamp
AWSTimestamp values will be mapped to AWS ElasticSearch dates representing the number of seconds since the epoch.
IP Type
AWSIPAddress
AWSIPAddress inputs will be converted to IP types in AWS ElasticSearch.
Other Types
AWSJSON
These fields will be treated as text data in AWS ElasticSearch and strings in GraphQL.
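The type rules above could be collected into an explicit index mapping along these lines. This is a sketch under assumptions: the helper `build_mapping` is hypothetical, the `epoch_second` format is one way to express "seconds since the epoch", and date formats for the ISO 8601 types are left out because the RFC does not specify them.

```python
from typing import Any, Dict, Set

# AppSync scalar -> Elasticsearch field mapping, following the list above.
SCALAR_TO_ES: Dict[str, Dict[str, str]] = {
    "AWSDate": {"type": "date"},
    "AWSTime": {"type": "date"},
    "AWSDateTime": {"type": "date"},
    "AWSEmail": {"type": "keyword"},
    "AWSPhone": {"type": "keyword"},
    "AWSURL": {"type": "keyword"},
    # Assumption: seconds since the epoch expressed with the epoch_second format.
    "AWSTimestamp": {"type": "date", "format": "epoch_second"},
    "AWSIPAddress": {"type": "ip"},
    "AWSJSON": {"type": "text"},
}


def build_mapping(fields: Dict[str, str], enums: Set[str]) -> Dict[str, Any]:
    """Build an explicit `properties` mapping from field name -> GraphQL type.

    Enum-typed fields become keyword fields. Types not covered here are
    skipped and left to Elasticsearch's dynamic mapping.
    """
    properties: Dict[str, Any] = {}
    for name, graphql_type in fields.items():
        if graphql_type in enums:
            properties[name] = {"type": "keyword"}
        elif graphql_type in SCALAR_TO_ES:
            properties[name] = dict(SCALAR_TO_ES[graphql_type])
    return {"properties": properties}


if __name__ == "__main__":
    print(build_mapping(
        {"status": "Status", "createdAt": "AWSDateTime", "lastSeen": "AWSTimestamp"},
        enums={"Status"},
    ))
```

Declaring these types up front, rather than relying on inference from the first document, avoids the type mismatches described earlier in the thread.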
Proposal 5: Custom mapping support
In some cases it is desirable to customize the way that GraphQL fields are mapped to AWS ElasticSearch. For example, sensitive information may need to be redacted, or a field may need to be mapped as a different data type. This RFC proposes a field level directive used to modify the mapping functionality, as shown below. In this example, the name field would not be indexed by AWS ElasticSearch, while the email field would (redundantly in this case) be mapped as a string.
Note: The name @searchableField and its API are still under discussion and subject to change.
Proposal 6: Local mocking
There are three primary hurdles to mocking @searchable:
$util.transform.toElasticsearchQueryDSL() will need to be implemented as well.
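To give a feel for the kind of translation a local mock would have to perform, here is a rough sketch of a filter-to-query-DSL converter. The operator names and the clauses they produce are assumptions chosen for illustration. This is not AppSync's implementation and its output is not guaranteed to match what `$util.transform.toElasticsearchQueryDSL()` produces.

```python
from typing import Any, Dict, List


def _field_clauses(field: str, conditions: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Translate one field's operators into Elasticsearch query clauses."""
    out: List[Dict[str, Any]] = []
    for op, operand in conditions.items():
        if op == "eq":
            out.append({"term": {field: operand}})
        elif op == "ne":
            out.append({"bool": {"must_not": {"term": {field: operand}}}})
        elif op in ("gt", "gte", "lt", "lte"):
            out.append({"range": {field: {op: operand}}})
        elif op == "range":
            # Inclusive [low, high] pair.
            out.append({"range": {field: {"gte": operand[0], "lte": operand[1]}}})
        elif op == "match":
            out.append({"match": {field: operand}})
        elif op == "matchPhrase":
            out.append({"match_phrase": {field: operand}})
        else:
            raise ValueError(f"unsupported operator: {op}")
    return out


def to_es_query(filter_input: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a nested filter object into a bool query (hypothetical mock)."""
    clauses: List[Dict[str, Any]] = []
    for key, value in filter_input.items():
        if key == "and":
            clauses.append({"bool": {"must": [to_es_query(f) for f in value]}})
        elif key == "or":
            clauses.append({"bool": {"should": [to_es_query(f) for f in value],
                                     "minimum_should_match": 1}})
        elif key == "not":
            clauses.append({"bool": {"must_not": to_es_query(value)}})
        else:
            clauses.extend(_field_clauses(key, value))
    return {"bool": {"must": clauses}}


if __name__ == "__main__":
    print(to_es_query({"title": {"match": "hello"}, "upvotes": {"gte": 10}}))
```

A real mock would also need to be checked against the responses of the deployed resolver, since small differences in the generated query can change which documents match.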