-
### Description
The recently introduced `_ignored` meta field (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-ignored-field.html) is helpful to detect issues with data inge…
-
**NOTE AS OF 2023-06-12** this issue scope changed but has been kept open and updated. See the bottom of the thread for more context.
We should consider areas for improvement regarding interactions…
-
In https://github.com/elastic/elasticsearch/pull/96083, we've added a JSON parsing pipeline which is currently not enabled by default.
We didn't enable JSON parsing by default as it comes with a ri…
-
Construct a Genomic Data Commons (GDC) data module exposing the various GDC projects for querying and rapid ingestion into the Bayesian Knowledge Base (BKB) architecture. The module will include at a minimu…
-
**Describe the bug**
When I try to create a datasource from a file of xls or xlsx type, at the "preparing data" step I get this error: "INGESTION_FILE_EXCEL_CONVERSION_ERROR", even though parsing was successful.
…
-
Hi, my name is Ansou FALL and I'm working for opensee.io as a DevOps engineer.
I'm running a ClickHouse operator in AKS (Azure Kubernetes Service).
We are facing a tricky issue on ClickHouse when…
-
Data ingestion rules are needed, including gathering more metadata or information about the species lists being loaded, such as the granularity of the data (e.g. species level or subspecies level), to ena…
-
1. Set up a scheduler so that these jobs can run automatically. (Recommended: Airflow on AWS, AWS Glue, or the Databricks scheduler.)
2. Use an ETL orchestrator to create a script that decrypts files in paralle…
-
Look into running this pipeline _within_ the CistromeDB file processing pipeline. This would allow the processed multivec files to stay in sync with CistromeDB and Cistrome Toolkit, for example upon n…
-
Currently, this plugin splits the initial setup into two parts, [register](https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/v11.22.7/lib/logstash/outputs/elasticsearch.rb#L282) and…