refactor fetch_metadata using workerpool

peppelinux commented 5 years ago

Very interesting. Would it be possible to refactor the configuration, replace optparse with argparse, and add this feature as well? I think the metadata fetcher workers would need to be separated from the MDX workers. In a production environment it would be useful and more efficient to decide how many workers to run. Is it sustainable?
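To make the request concrete, here is a minimal sketch of what separate worker settings could look like with argparse and a thread pool. The flag names `--fetch-workers` and `--mdx-workers` and the helpers `build_parser` and `fetch_all` are hypothetical, not existing pyFF options or functions, and `str.upper` only stands in for a real fetcher so the example runs without network access.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def build_parser():
    # Hypothetical flags: these are NOT existing pyFF command-line options.
    parser = argparse.ArgumentParser(description="worker settings sketch")
    parser.add_argument("--fetch-workers", type=int, default=4,
                        help="threads used to fetch remote metadata")
    parser.add_argument("--mdx-workers", type=int, default=2,
                        help="threads used to answer MDX requests")
    return parser


def fetch_all(urls, fetch_one, workers):
    """Apply fetch_one to every URL in a pool of `workers` threads.

    pool.map preserves input order, so results line up with urls.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, urls))


# Parse an explicit argument list instead of sys.argv to keep the sketch
# self-contained.
args = build_parser().parse_args(["--fetch-workers", "8"])

# str.upper is a stand-in for a function that downloads one metadata URL.
results = fetch_all(["http://a.example/md", "http://b.example/md"],
                    str.upper, args.fetch_workers)
```

Keeping the two counts as separate options would let an operator size the fetcher pool for slow remote sources without changing how many workers serve requests.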

leifj commented 5 years ago

Yeah, I think so, although most people seem to favor generating the MDQ server output as a batched set of static files these days and putting it all on a CDN for scalability. I like the idea tho!
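As a rough illustration of the static-file approach, the sketch below writes one file per entity, named with the `{sha1}` identifier form that the MDQ protocol defines (the literal `{sha1}` followed by the hex SHA-1 of the entityID). The directory layout, the helper names `sha1_id` and `write_static_tree`, and the sample entity are assumptions for illustration, not a description of pyFF's actual batch output.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_id(entity_id):
    """Return the MDQ-style '{sha1}<hex digest>' form of an entityID."""
    return "{sha1}" + hashlib.sha1(entity_id.encode("utf-8")).hexdigest()


def write_static_tree(entities, out_dir):
    """Write one XML file per entity under out_dir/entities/.

    `entities` maps entityID -> serialized metadata. The layout is an
    assumption for this sketch, not pyFF's documented output format.
    """
    target = Path(out_dir) / "entities"
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for entity_id, xml in entities.items():
        path = target / sha1_id(entity_id)
        path.write_text(xml, encoding="utf-8")
        paths.append(path)
    return paths


# Example run against a temporary directory; a real deployment would sync
# the resulting tree to the CDN origin.
with tempfile.TemporaryDirectory() as tmp:
    written = write_static_tree(
        {"https://idp.example.org/idp": "<EntityDescriptor/>"}, tmp)
    names = [p.name for p in written]
    contents = [p.read_text(encoding="utf-8") for p in written]
```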

peppelinux commented 5 years ago

OK, good to hear. I'd also like to talk about the following.

1. An SQL/NoSQL service where pyFF's aggregator/validator classifies metadata. It should work through an ORM rather than hardcoded queries. MongoDB, MySQL, Redis, PostgreSQL... in Django we can use all of these with a standardized syntax, approach and semantics.
2. If the pyffd metadata workers were run in batches by an autonomous scheduler such as cron, or by other non-blocking approaches such as Celery, Tornado or Twisted, all the output could be written to the DB.
3. The MDX server could then simply read the results from the DB and answer HTTP requests.
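The split described in these three points could be sketched as follows. This is only an illustration of the decoupling: it uses the standard-library `sqlite3` module with plain SQL in place of the ORM being proposed, and the table name and the helpers `open_store`, `store_batch` and `lookup` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the shared store; an in-memory database keeps the sketch runnable."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entity ("
        "entity_id TEXT PRIMARY KEY, xml TEXT NOT NULL)"
    )
    return conn


def store_batch(conn, entities):
    """Writer side: run by the scheduled aggregation job (cron, Celery, ...).

    `entities` maps entityID -> validated metadata; existing rows are replaced.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO entity (entity_id, xml) VALUES (?, ?)",
            entities.items(),
        )


def lookup(conn, entity_id):
    """Reader side: all the MDX frontend has to do to answer a request."""
    row = conn.execute(
        "SELECT xml FROM entity WHERE entity_id = ?", (entity_id,)
    ).fetchone()
    return row[0] if row else None


conn = open_store()
store_batch(conn, {"https://idp.example.org/idp": "<EntityDescriptor/>"})
found = lookup(conn, "https://idp.example.org/idp")
missing = lookup(conn, "https://unknown.example.org")
```

With this shape the HTTP frontend never fetches or validates anything itself, so it could be written in any framework that can read the same store.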

Sorry to bring up Django again, but the 3rd point is very trivial with Django. There is nothing more to say about URL regex patterns and security: it comes with batteries included! Even if the DB schema changes, Django's schema migration system will adapt it semi-automatically. If we could decouple the MDX functions from the HTTP frontend, I will finalize an MDX in Django. You are probably already doing this; show me the way and I will follow it.

Also: I saw an MDX server in JS in this project... Is this the final way?

leifj commented 5 years ago


On 29 Apr 2019, at 19:47, Giuseppe De Marco notifications@github.com wrote:

> OK, good to hear. I'd also like to talk about the following.

> 1. An SQL/NoSQL service where pyFF's aggregator/validator classifies metadata. It should work through an ORM rather than hardcoded queries. MongoDB, MySQL, Redis, PostgreSQL... in Django we can use all of these with a standardized syntax, approach and semantics.
> 2. If the pyffd metadata workers were run in batches by an autonomous scheduler such as cron, or by other non-blocking approaches such as Celery, Tornado or Twisted, all the output could be written to the DB.
> 3. The MDX server could then simply read the results from the DB and answer HTTP requests.

I've listed some of this in the roadmap, but in different words...

> Sorry to bring up Django again, but the 3rd point is very trivial with Django. There is nothing more to say about URL regex patterns and security: it comes with batteries included! Even if the DB schema changes, Django's schema migration system will adapt it semi-automatically. If we could decouple the MDX functions from the HTTP frontend, I will finalize an MDX in Django. You are probably already doing this; show me the way and I will follow it.

Good luck!

> Also: I saw an MDX server in JS in this project... Is this the final way?

That is a JSON-only MDQ optimized for an in-memory search index... not a general tool.


peppelinux commented 5 years ago

> Sorry to bring up Django again, but the 3rd point is very trivial with Django. There is nothing more to say about URL regex patterns and security: it comes with batteries included! Even if the DB schema changes, Django's schema migration system will adapt it semi-automatically. If we could decouple the MDX functions from the HTTP frontend, I will finalize an MDX in Django. You are probably already doing this; show me the way and I will follow it.

> > Good luck!

Got it 😜

IdentityPython / pyFF

refactor fetch_metadata using workerpool #19