sing-group / bdp4j

Big Data Pipelining For Java
GNU General Public License v3.0

can you give me a spark data process demo? #1

Open aixuedegege opened 5 years ago

aixuedegege commented 5 years ago

I want to use Spark to build a data processing pipeline that starts by downloading pictures and then scales them, crops them, and so on. How can I use your framework for this? Could you give me a demo? Thanks a lot!

moncho-mendez commented 5 years ago

Hello

Sorry for the delay, but we had to write some code to address your request.

We have not integrated this product with Spark and, unfortunately, we do not have a demo for it (but we have put together an example as quickly as we could, and it is attached). This project is a simple pipeline implementation derived from the Mallet pipeline (some source code was taken from there) with some interesting features (some of them still to come). Interesting features are:

Although the basic pipe functionality works, we are still developing most of the interesting features. As you can imagine (from reading the description provided), there is a lot of work left to do.

I am attaching one example here that preprocesses SMS messages extracted from http://www.esp.uem.es/jmgomez/smsspamcorpus/. It is very simple, but in it you can see several different pipes working together. We have also added the example to the source repository. The example uses simple data, but you can extract properties in more complex ways.
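To give a rough idea of what "several pipes working together" means, here is a minimal, self-contained sketch of the general pattern. Please note that this is not the bdp4j API: the `Pipe` interface, the `run` method, and the `examplePipes` chain are hypothetical names made up for this illustration, and the messages are plain strings rather than real corpus instances. Each pipe transforms a message or discards it by returning `null`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SmsPipeSketch {

    /** Hypothetical pipe (not the bdp4j interface): transforms a message, or returns null to discard it. */
    public interface Pipe {
        String apply(String message);
    }

    /** Runs every message through the pipes in order and drops discarded messages. */
    public static List<String> run(List<String> messages, List<Pipe> pipes) {
        List<String> result = new ArrayList<>();
        for (String message : messages) {
            String current = message;
            for (Pipe pipe : pipes) {
                current = pipe.apply(current);
                if (current == null) {
                    break; // a pipe discarded this message, skip the remaining pipes
                }
            }
            if (current != null) {
                result.add(current);
            }
        }
        return result;
    }

    /** Illustrative chain: trim whitespace, drop empty messages, convert to lowercase. */
    public static List<Pipe> examplePipes() {
        Pipe trim = String::trim;
        Pipe dropEmpty = m -> m.isEmpty() ? null : m;
        Pipe lowercase = m -> m.toLowerCase(Locale.ROOT);
        return Arrays.asList(trim, dropEmpty, lowercase);
    }

    public static void main(String[] args) {
        // Made-up sample messages, not taken from the SMS corpus.
        List<String> messages = Arrays.asList("  WIN a FREE prize now!  ", "   ", "See you at 5");
        System.out.println(run(messages, examplePipes()));
    }
}
```

The same shape would apply to other kinds of data (for example, a download pipe followed by scaling and cropping pipes for pictures), with the instance type changed from `String` to whatever the pipeline carries.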

We hope you find our project useful. We are working hard to complete more features, but our team is small (one undergraduate student, one Ph.D. student, and a teacher with lots of other things to do), so we work slowly. Still, we hope to have the entire functionality working before the end of summer (August).

Thanks for your interest! The example is attached below.

bdp4j_sample.zip

P.S. If you do end up using bdp4j, we would appreciate it if you let us know about your project. Of course, we would be happy to answer your questions and help you get everything working.

With best regards,
bdp4j Team (Yeray, María & Moncho)