rkoval / alfred-aws-console-services-workflow

A powerful workflow for quickly opening up AWS Console Services in your browser or searching for entities within them.
MIT License

Wondering how does the backend elastic search work? #34

Closed MacHu-GWU closed 3 years ago

MacHu-GWU commented 3 years ago

Hi @rkoval,

First I want to thank you for the awesome workflow.

I want to know how the backend Elasticsearch (ES) search works. I can think of two possibilities:

  1. You host an ES server yourself and expose it to the public as read-only.
  2. Your Go code silently builds a local ES index on the user's computer and queries it.

May I know which is the case?

The reason I ask is that I created a solution based on a local, ES-style index: developers bring their own data as JSON and define how to index it in a JSON settings file, and the Alfred workflow can then use the full-text search engine for any purpose: https://github.com/MacHu-GWU/afwf_fts_anything-project

I am wondering whether I could take your .yml data file and extend it a little bit for my own use (of course I will keep your license file and cite it).

rkoval commented 3 years ago

hey @MacHu-GWU ! thanks for using the workflow! glad you are enjoying it

your full-text project looks pretty cool! i looked at your cloudformation repo, and that seems like it could be very useful to have a uniform place in alfred where you can query software docs throughout the web

regarding how the search works: the workflow utilizes awgo, which in turn uses go-fuzzy for its fuzzy searching. i'm not sure if you are asking if this literally uses Elasticsearch, but the library has all of its search logic self-contained within go code without needing to reach out to a data store. it's a pretty powerful library, but i wouldn't call it full-text search because it doesn't do any sort of language analysis/index building for matching terms (like stemming). the search algorithm just does some clever tricks for matching on the raw strings.
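To make "matching on the raw strings" concrete, here is a minimal sketch of in-memory fuzzy matching, written in Python for brevity. It is not go-fuzzy's actual algorithm: the function names and the scoring bonuses are assumptions chosen for illustration. It only shows the general idea of matching query characters in order against each candidate string, with no index and no stemming.

```python
from typing import List, Optional


def fuzzy_score(query: str, candidate: str) -> Optional[int]:
    """Return a score if every query character appears, in order, in candidate.

    Returns None when there is no match. The bonuses are a hypothetical
    heuristic: +1 per matched character, +2 more when it directly follows the
    previous match, +3 more when it starts a word.
    """
    q = query.lower()
    c = candidate.lower()
    score = 0
    pos = 0     # next index in c to search from
    prev = -2   # index of the previously matched character
    for ch in q:
        # Greedy leftmost match; simple, but not guaranteed to find the
        # highest-scoring alignment.
        idx = c.find(ch, pos)
        if idx == -1:
            return None
        score += 1
        if idx == prev + 1:
            score += 2
        if idx == 0 or not c[idx - 1].isalnum():
            score += 3
        prev = idx
        pos = idx + 1
    return score


def fuzzy_filter(query: str, candidates: List[str]) -> List[str]:
    """Keep only matching candidates, best score first, ties broken by name."""
    scored = [(fuzzy_score(query, c), c) for c in candidates]
    matches = [(s, c) for s, c in scored if s is not None]
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in matches]
```

With this sketch, the query "sg" matches "security groups" through its word initials, while the query "groups" does not match "security group", because nothing reduces "groups" to "group" before comparing.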

regarding your usage of the .yml file: feel free! credit with a backlink to this repo would definitely be appreciated

please let me know if i didn't answer your questions correctly. thanks again!

MacHu-GWU commented 3 years ago

@rkoval thank you for the awesome reply.

I see, so it is an in-memory matching algorithm without any index.

The one I am using is Whoosh (https://whoosh.readthedocs.io/en/latest/intro.html), a Lucene-style full-text search library written in pure Python that supports stemming and tokenization. It performs well even when the number of items is >= 1,000,000.
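For contrast with raw-string matching, here is a minimal standard-library sketch of what "index plus stemming" means. It does not use Whoosh or reproduce its API; the `stem`, `analyze`, `build_index`, and `search` helpers are hypothetical, and the suffix stripping is deliberately crude compared with a real stemmer.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def stem(token: str) -> str:
    """Crude suffix stripping, for illustration only."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> List[str]:
    """Lowercase, split into alphanumeric tokens, and stem each token."""
    return [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index mapping each stemmed term to document ids."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return ids of documents containing every stemmed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Because both documents and queries pass through `analyze`, a query for "security group" finds a document titled "EC2 security groups", which plain string matching on the raw text would miss.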

I was wondering if there's an official AWS Console spec file like this, because AWS publishes a spec file for CloudFormation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-resource-specification.html. If such a spec file is available, you may not need to maintain the .yml file manually anymore:

[
  {"service_name": "EC2", "subservice_name": "security_group", "service_url_id": "ec2", "subservice_url_id": "SecurityGroup"},
  ...
]
rkoval commented 3 years ago

> I see, so it is an in-memory matching algorithm without any index.

yup, i believe this is the case

> The one I am using is Whoosh (https://whoosh.readthedocs.io/en/latest/intro.html), a Lucene-style full-text search library written in pure Python that supports stemming and tokenization. It performs well even when the number of items is >= 1,000,000.

ahh this is very interesting. this seems like a pretty cool and powerful library. a thought though is that i chose golang for this workflow specifically for its speed and that there was an alfred library for it. there actually exists a previous version of this workflow written in python that i worked on with an old co-worker (it's no longer maintained, but you can find it here). however, python's performance got to be too annoying for me because slowdowns were so apparent, so i decided to re-write it in go. python's startup overhead is especially noticeable because alfred re-runs the script on every character you type. there are of course ways around that, like always having a background server running that alfred acts as a client to; however, that felt like unnecessary complexity when this go implementation performs up to my standards just fine without it.
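To show why per-keystroke startup cost matters, here is a minimal sketch of a script that Alfred could run as a Script Filter, assuming the query arrives as the first command-line argument. The `items`, `title`, `subtitle`, and `arg` keys follow Alfred's Script Filter JSON format; the `SERVICES` list and the `script_filter_response` helper are hypothetical stand-ins, not code from this workflow.

```python
import json
import sys
from typing import List

# Hypothetical sample data; the real workflow loads its entries from a .yml file.
SERVICES = ["EC2", "ECS", "S3", "Lambda"]


def script_filter_response(query: str, services: List[str]) -> str:
    """Build Script Filter JSON for the entries whose name contains the query."""
    q = query.lower()
    items = [
        {
            "title": name,
            "subtitle": f"Open {name} in the AWS console",
            "arg": name.lower(),
        }
        for name in services
        if q in name.lower()
    ]
    return json.dumps({"items": items})


if __name__ == "__main__":
    # A fresh process is started for each keystroke, so interpreter startup
    # and data loading are paid again every time before this line runs.
    query = sys.argv[1] if len(sys.argv) > 1 else ""
    print(script_filter_response(query, SERVICES))
```

Typing "ec" would run the whole script twice, once for "e" and once for "ec", which is why a slow-starting interpreter is felt immediately and why a compiled binary or a long-running background server avoids the problem.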

that said, i am not a go expert, so i am unable to recommend a full-text search library that could be comparable to the python one you shared. additionally, there's also the tradeoff that golang can sometimes just be harder to work with when compared to a more flexible interpreted language like python

> I was wondering if there's an official AWS Console spec file like this, because AWS publishes a spec file for CloudFormation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-resource-specification.html. If such a spec file is available, you may not need to maintain the .yml file manually anymore.

ooh this is a good tip! i wasn't aware that AWS maintained files for these resources. though, it seems like they're more specific to cloudformation and its API documentation, so i'm not sure if this repo can make much use of them right now. from a brief skim, i don't see any info related to the resources' AWS console counterparts. thanks for the heads up though!

regarding the .yml format, the one i have here is unofficial. when i first started working on this workflow years ago, i was unable to find anything elsewhere on the internet that would help for this purpose. i haven't looked recently though, so there may be something for it (i just briefly googled though and still didn't really find anything). it would be great if i could remove the yml file altogether in favor of an official doc that has all of the information i could need for this workflow. the manual maintenance can definitely get tedious

rkoval commented 3 years ago

i'm doing some cleanup here, so i'm going to close this issue for now. lmk if i didn't resolve your issue or you have more questions. thanks again for using the workflow!