Hadoop/Spark input missing segments

From Caleb:

2016-05-16 10:10:16 [ForkJoinPool.commonPool-worker-14] INFO  BeJobLauncher:192 - Adding data service path for bucket /backup/ttm/master: service_name=Aleph2EsInputFormat options={es.resource=ttm_master__2b5fd691353a/type_1,stats, es.index.read.missing.as.empty=yes, es.query=?q=*}

This was logged for a bucket that comprises two indexes (ttm_master__2b5fd691353a and ttm_master__2b5fd691353a_1), yet es.resource above lists only the first.

I think the problem is that when tmin/tmax are specified, it tries to filter on date. My guess is that it's picking up the segment id as a date and then ignoring that index.

                final String final_index = getTimedIndexes(job_input, index_type_mapping, new Date())
                        .map(s -> Stream.concat(s, TimeSliceDirUtils.getUntimedDirectories(index_type_mapping.keySet().stream()))
                                .collect(Collectors.joining(",")))
                        .orElse(index_resource);

Ah, it looks like candidateTimedDirectories in TimeSliceDirUtils in aleph2_core_shared_library doesn't expect the segment id (which is purely an ES construct).
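
To make the suspected failure mode concrete, here is a minimal sketch. It is not the actual TimeSliceDirUtils code: the class and method names are hypothetical, and it assumes that a time suffix is whatever follows the last "_" when that part contains only digits and dots, and that a segment id is a trailing "_" followed by one to three digits. Under those assumptions a naive check classifies the "_1" segment index as timed, so it would be dropped by the date filter and also left out of the untimed list. Stripping the segment id before the check avoids that.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Aleph2.
final class SegmentSuffixSketch {

    // Assumption: a segment id is a trailing "_" plus 1-3 digits (e.g. "_1").
    private static final Pattern SEGMENT_SUFFIX = Pattern.compile("^(.+)_(\\d{1,3})$");

    // Naive check: treat anything after the last '_' made of digits and dots
    // as a date suffix. A bare segment id such as "1" passes this check.
    static boolean looksTimedNaive(String index) {
        int i = index.lastIndexOf('_');
        return i >= 0 && index.substring(i + 1).matches("[0-9.]+");
    }

    // Remove the trailing segment id, if there is one.
    static String stripSegment(String index) {
        Matcher m = SEGMENT_SUFFIX.matcher(index);
        return m.matches() ? m.group(1) : index;
    }

    // Segment-aware check: drop the segment id first, then test for a date suffix.
    static boolean looksTimed(String index) {
        return looksTimedNaive(stripSegment(index));
    }
}
```

With this sketch, ttm_master__2b5fd691353a_1 is treated as untimed once the segment id is stripped, so it would be kept alongside ttm_master__2b5fd691353a. The three-digit limit is only there to keep a four-digit year suffix from being mistaken for a segment id; the real fix would need to follow whatever naming scheme the ES layer actually uses.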

IKANOW / Aleph2

Hadoop/spark input missing segments #94