Open njmattes opened 8 years ago
OK. I need to recap some of this stuff. I have written it down sparsely over time, but I'll consolidate it this week. It would also be good to get some extra input from some of @cszc's preliminary results.
Yes, that's a good idea to coordinate with @cszc. No rush really on these milestones—would just be a shame to lose track of all the potential enhancements.
Hi All. Just seeing these comments now. For some reason, it's been difficult to sort out when I've been tagged... I think I need to filter my alerts a little better.
I'm doing the first interview with Joshua today. Would be happy to go over the results this week or next.
Also @njmattes, I was thinking about the future problem of allowing users to search the metadata. Have you considered using a separate search index, like Solr or Elasticsearch? I think maintaining a search index would be easier and faster than querying the database and parsing JSON. We'd have to find some way to ingest new metadata into the search index though.
Just a thought. Also, not sure where to share thoughts like these... feel free to point me in another direction if there's more appropriate space for it.
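To make the search-index idea a bit more concrete, here is a minimal sketch of an in-memory inverted index over JSON metadata. It is only an illustration of the principle, not how Solr or Elasticsearch work internally, and the names `MetadataIndex`, `ingest`, and `search` are hypothetical. The point is that the JSON is parsed once at ingest time, so a query becomes a set lookup instead of a scan over stored JSON.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def iter_strings(value):
    """Yield every string in a nested JSON value, including dict keys.

    Numbers, booleans, and nulls are skipped in this sketch.
    """
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield key
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


class MetadataIndex:
    """Hypothetical toy index: maps each token to the set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def ingest(self, doc_id, metadata_json):
        """Parse one JSON metadata string and add its tokens to the index."""
        metadata = json.loads(metadata_json)
        for text in iter_strings(metadata):
            for token in tokenize(text):
                self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every token in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        matches = [self.postings.get(token, set()) for token in tokens]
        return set.intersection(*matches)
```

New metadata would be passed to `ingest` whenever a record is added to the database; keeping the two in sync is the part that would need real design work.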
@cszc Some time ago I looked into Elasticsearch for geospatial use, but I've paused that effort for now because of our current development work. Still, it is an interesting idea to use a Lucene-based tool to map JSON data. While attending Elastic-on last year, I saw an application that coupled ES with a traditional Oracle 10 database for faster search. Perhaps in the near future we could test it with our current Postgres implementation. @njmattes this could also somehow be part of that micro-language specification.
@cszc Yes, at some point a proper search implementation like that would rip. Perhaps we should start a fresh issue for discussing search implementation. And @cszc feel free to start as many issues as you want to discuss whatever comes up. I'm actually not used to using GitHub's issues, so my apologies if I'm not using this in a way that's useful for everyone.
Steve Tuecke and I are discussing ways in which we can add search and discovery features to Globus. The basic idea is to allow users to associate metadata with arbitrary files and directories on their endpoints (e.g., via ".gmeta" JSON files); we will hoover that up and index it, perhaps via Elasticsearch or similar. We want to start some prototyping in the near future, and then engage some high school students over the summer. I have the idea that we should be able to link that work with what you are discussing here.
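As a rough sketch of the "hoover that up" step: the function below walks a directory tree, reads every file named `.gmeta` as JSON, and returns one document per file, ready to hand to whatever indexer is chosen. This is not Globus code. The function name `collect_gmeta`, the document shape, and the assumption that each `.gmeta` file describes the directory it sits in are all hypothetical.

```python
import json
import os


def collect_gmeta(root):
    """Return one document per readable `.gmeta` JSON file under `root`.

    Assumption for this sketch: a `.gmeta` file describes its own directory.
    Files that cannot be read or parsed are skipped rather than raising.
    """
    documents = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".gmeta" not in filenames:
            continue
        path = os.path.join(dirpath, ".gmeta")
        try:
            with open(path, encoding="utf-8") as handle:
                metadata = json.load(handle)
        except (OSError, json.JSONDecodeError):
            continue
        documents.append({"directory": dirpath, "metadata": metadata})
    # Sort so the output does not depend on filesystem traversal order.
    documents.sort(key=lambda doc: doc["directory"])
    return documents
```

Each returned document could then be sent to a search backend; which backend, and how updates and deletions are detected, is left open here.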
Would absolutely rip if this 'metadata searching' could be solved in similar ways in each project. @ricardobarroslourenco you must be happy that Elasticsearch keeps coming up in discussions lately!
Glad to hear this, @ianfoster. People at Elasticsearch are doing interesting work in several domains. They recently made the videos from their last conference public, and there are some interesting talks.
@njmattes on this site there is a talk by Nick Knize, who is the lead engineer for geospatial. He is involved in both Elasticsearch and the Lucene project, because all the modules they use are based on it. And to answer you: very happy to see this nice piece of software coming back into our discussions :)
@ricardobarroslourenco We've talked a lot over the past several weeks about dozens of possible future enhancements to EDE, on various timescales from 'as soon as possible' to 'not likely to happen before the end of time'. I think it might be useful to start adding milestones for maybe three or four timeframes for these enhancements, and adding issues to them, so that we don't forget them. What do you think?