Rationale
While investigating experiment lineage API performance, it became clear that lineage results could be streamed rather than loaded entirely into memory before being sent over the wire. These changes stream the lineage API payload after the initial lineage query runs, which allows experiment objects (e.g. ExpData, ExpMaterial) to be resolved in batches and then discarded.
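To illustrate the batched resolve-and-discard idea, here is a minimal Java sketch. It is not the code in this PR: `BatchedLineageStreamer`, the `resolver` function, and the `{"name": ...}` output shape are all hypothetical stand-ins, and the real `streamLineage()` may be structured differently.

```java
import java.io.PrintWriter;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the actual lineage code.
final class BatchedLineageStreamer
{
    /**
     * Writes a JSON array of nodes to out, resolving ids in batches of batchSize
     * so that only one batch of resolved objects is referenced at a time.
     */
    static void stream(List<Integer> ids, int batchSize,
                       Function<List<Integer>, List<String>> resolver, PrintWriter out)
    {
        if (batchSize < 1)
            throw new IllegalArgumentException("batchSize must be positive");

        out.print("[");
        boolean first = true;
        for (int start = 0; start < ids.size(); start += batchSize)
        {
            List<Integer> batch = ids.subList(start, Math.min(start + batchSize, ids.size()));
            // One lookup per batch (assumed to stand in for a bulk database fetch).
            List<String> resolved = resolver.apply(batch);
            for (String name : resolved)
            {
                if (!first)
                    out.print(",");
                // Assumption: names contain no characters that need JSON escaping.
                out.print("{\"name\":\"" + name + "\"}");
                first = false;
            }
            // "resolved" goes out of scope here, so the batch can be garbage collected.
        }
        out.print("]");
        out.flush();
    }
}
```

The full list of ids is still held in memory, which mirrors running the initial lineage query up front; only the heavier resolved objects are processed batch by batch.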
Changes
Refactor ExpLineage.toJSON() into a new ExpLineageServiceImpl.streamLineage(), which streams the lineage results without building up the entire JSON response in memory.
Convert ExpLineage.Edge from a class to a record. Inline the "no role" role, which every edge carries and which adds little information.
Introduce ExpLineage.Edges (note the trailing "s") to make the distinction between parent and child edges conceptually clearer.
Update the ApiJsonWriter interface to allow starting and ending JSON objects.
Introduce a new ExpLineageService to house experiment lineage-related services.
ExpServiceImpl, the long-standing home for this kind of code, has exceeded 10k LOC and needs to be broken up. That said, I did not move everything that is lineage-related, only the primary methods for querying and streaming lineage results.
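As a rough illustration of the class-to-record conversion and the parents/children grouping, the sketch below uses assumed shapes. The field names (`parent`, `child`, `parents`, `children`) are hypothetical, the real ExpLineage.Edge and ExpLineage.Edges may carry different fields, and the inlined "no role" value is left out.

```java
import java.util.List;

// Hypothetical shape of an edge; a record provides equals, hashCode, and accessors.
record Edge(String parent, String child) {}

// Hypothetical grouping of a node's edges by direction.
record Edges(List<Edge> parents, List<Edge> children)
{
    // Compact constructor: take defensive, unmodifiable copies of both lists.
    Edges
    {
        parents = List.copyOf(parents);
        children = List.copyOf(children);
    }
}
```

With a record, two edges with the same endpoints compare equal without hand-written equals and hashCode, which is convenient when edges are stored in sets.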