Closed aliceranzhou closed 8 years ago
I added the API question as an issue here: https://github.com/lintool/warcbase/issues/189
That way it doesn't get lost in a pull request.
Good point. I've moved the records into separate classes. The only thing is that they are no longer implicitly converted.
For the error, should we keep it as is or disable the error reporting?
The only thing is that they are no longer implicitly converted.
What are the implications of this? As in, are your scripts going to change?
For the error, should we keep it as is or disable the error reporting?
Keep it for now...
No, the scripts will stay the same. Previously, any ARCRecord or WARCRecord was implicitly converted to a WARecord. Now, the RecordLoader explicitly initializes ArchiveRecords.
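To illustrate the difference, here is a minimal sketch rather than the actual warcbase code. RawArc, RawWarc, Wrapped, viaImplicit, and load are all hypothetical names made up for the example. The first half relies on the compiler inserting a conversion; the second half has the loader build the wrapper itself.

```scala
import scala.language.implicitConversions

object ConversionSketch {
  // Hypothetical stand-ins for the raw reader types (not the real ARC/WARC classes).
  final case class RawArc(url: String)
  final case class RawWarc(url: String)

  // Hypothetical common wrapper that scripts work against.
  final case class Wrapped(url: String, format: String)

  // "Before": conversions are implicit, so any raw record silently becomes a Wrapped
  // wherever a Wrapped is expected.
  object Implicits {
    implicit def arcToWrapped(r: RawArc): Wrapped = Wrapped(r.url, "arc")
    implicit def warcToWrapped(r: RawWarc): Wrapped = Wrapped(r.url, "warc")
  }

  def viaImplicit(r: RawArc): Wrapped = {
    import Implicits._
    r // the compiler rewrites this to arcToWrapped(r)
  }

  // "After": the loader constructs the wrapper explicitly; no implicits are in scope.
  def load(raws: Seq[Either[RawArc, RawWarc]]): Seq[Wrapped] =
    raws.map {
      case Left(a)  => Wrapped(a.url, "arc")
      case Right(w) => Wrapped(w.url, "warc")
    }
}
```

Either way the caller ends up holding the same wrapper type, which is why scripts written against the wrapper would not need to change.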
Looks good, thanks! I've merged commit ba2b44c52b35cb026c5665f250d77f4c9b586a80
A few comments:
Playing with src/test/resources/arc/example.arc.gz, I'm getting:
I believe the error has always been there (something inconsistent in the data itself), but previously it's been masked by the API... so this doesn't seem like something we should worry about?
If I do:
It works, which is what I'd want, so yay! However, it gives a type Array[org.warcbase.spark.matchbox.RecordTransformers.WARecord].
I'm wondering why WARecord is an inner class of RecordTransformers?
Shouldn't we actually have something like:
org.warcbase.spark.ArchiveRecord
org.warcbase.spark.ArcRecord
org.warcbase.spark.WarcRecord
Note they shouldn't actually be in the matchbox package since they aren't UDFs, no? Also, ArchiveRecord instead of WARecord, to be clearer?
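A rough sketch of the layout suggested above, under assumptions: the members (url, mimeType) and the htmlUrls helper are hypothetical and only there to make the example compile. The enclosing object exists solely to keep the sketch self-contained. In the actual proposal these would be top-level types in org.warcbase.spark.

```scala
// Sketch only. In the proposal these types would sit directly in
// package org.warcbase.spark (outside matchbox), not inside an object.
object RecordSketch {
  // Common interface that scripts would see, e.g. as Array[ArchiveRecord].
  trait ArchiveRecord {
    def url: String      // hypothetical member
    def mimeType: String // hypothetical member
  }

  // One concrete class per container format.
  final case class ArcRecord(url: String, mimeType: String) extends ArchiveRecord
  final case class WarcRecord(url: String, mimeType: String) extends ArchiveRecord

  // Hypothetical helper: code written against the trait works for both formats.
  def htmlUrls(records: Seq[ArchiveRecord]): Seq[String] =
    records.filter(_.mimeType == "text/html").map(_.url)
}
```

With something like this, the type reported to the user would be the shared ArchiveRecord rather than an inner class of RecordTransformers.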