Open oonisim opened 2 years ago
@oonisim Hi!
Is this the end of the study for this book, or is there any chance of getting the data?
Maybe our mailing list is here: https://lists.apache.org/list.html?dev@spark.apache.org
But how do we get the data in XML format?
We can also download the mails as an mbox archive (I do not know anything about this format).
The Apache mailing list has changed its interface and is no longer served by mod_mbox of Apache HTTP, so a URL like http://mail-archives.apache.org/mod_mbox/spark-dev/201911.mbox/ajax/thread?0 will cause an error because of the /ajax part.
With /ajax removed, the URL http://mail-archives.apache.org/mod_mbox/spark-dev/201911.mbox/thread?0 redirects to the new interface (dev@spark.apache.org, November 2019), but that interface does not provide an MBOX format listing, so the MBOX format elements such as FROM, TO, SUBJECT cannot be extracted from it.
The thread ID pattern is different now too, e.g. https://lists.apache.org/thread/hg85hhvt270of8fdrmb62kfvm7rpl96p
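Regarding the mbox format mentioned above: it is a plain-text file of concatenated emails, and Python's standard `mailbox` module can read it. Below is a minimal sketch, assuming an mbox archive has already been downloaded to a local file. The function name `extract_headers` is made up for illustration and is not something from the book.

```python
import mailbox


def extract_headers(mbox_path):
    """Return a list of dicts holding the From, To and Subject headers
    of every message in a local mbox file.

    mbox_path is a hypothetical local path to a downloaded archive.
    """
    # create=False makes a missing file raise instead of silently
    # creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [
            {
                "from": msg.get("From"),
                "to": msg.get("To"),
                "subject": msg.get("Subject"),
            }
            for msg in box
        ]
    finally:
        box.close()
```

It could be called as `extract_headers("201911.mbox")`, where the file name is only an example. A header that is absent from a message comes back as `None`.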