Some of the caching solutions currently implemented in Spark, namely CachedWebCrawlerJob and PatentMetadataRetrieverJob, rely on RDDs, whereas they could take full advantage of Spark 2 DataFrames, as was done for TARA caching (CachedTaraReferenceExtractionJob).