Introduction
This connector uses the twitter streaming api to listen for status update messages and
convert them to a Kafka Connect struct on the fly. The goal is to match as much of the
Twitter Status object as possible.
Configuration
TwitterSourceConnector
This Twitter Source connector is used to pull data from Twitter in realtime.
name=connector1
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector
# Set these required values
twitter.oauth.accessTokenSecret=
process.deletes=
filter.keywords=
kafka.status.topic=
kafka.delete.topic=
twitter.oauth.consumerSecret=
twitter.oauth.accessToken=
twitter.oauth.consumerKey=
Name |
Description |
Type |
Default |
Valid Values |
Importance |
filter.keywords |
Twitter keywords to filter for. |
list |
|
|
high |
filter.userIds |
Twitter user IDs to follow. |
list |
"" |
|
low |
kafka.delete.topic |
Kafka topic to write delete events to. |
string |
|
|
high |
kafka.status.topic |
Kafka topic to write the statuses to. |
string |
|
|
high |
process.deletes |
Should this connector process deletes. |
boolean |
|
|
high |
twitter.oauth.accessToken |
OAuth access token |
password |
|
|
high |
twitter.oauth.accessTokenSecret |
OAuth access token secret |
password |
|
|
high |
twitter.oauth.consumerKey |
OAuth consumer key |
password |
|
|
high |
twitter.oauth.consumerSecret |
OAuth consumer secret |
password |
|
|
high |
twitter.debug |
Flag to enable debug logging for the twitter api. |
boolean |
false |
|
low |
Schemas
com.github.jcustenborder.kafka.connect.twitter.Place
Returns the place attached to this status
com.github.jcustenborder.kafka.connect.twitter.GeoLocation
Returns The location that this tweet refers to if available.
Name |
Optional |
Schema |
Default Value |
Documentation |
Latitude |
false |
Float64 |
|
returns the latitude of the geo location |
Longitude |
false |
Float64 |
|
returns the longitude of the geo location |
com.github.jcustenborder.kafka.connect.twitter.StatusDeletionNotice
Message that is received when a status is deleted from Twitter.
Name |
Optional |
Schema |
Default Value |
Documentation |
StatusId |
false |
Int64 |
|
|
UserId |
false |
Int64 |
|
|
com.github.jcustenborder.kafka.connect.twitter.StatusDeletionNoticeKey
Key for a message that is received when a status is deleted from Twitter.
Name |
Optional |
Schema |
Default Value |
Documentation |
StatusId |
false |
Int64 |
|
|
com.github.jcustenborder.kafka.connect.twitter.StatusKey
Key for a twitter status.
Name |
Optional |
Schema |
Default Value |
Documentation |
Id |
true |
Int64 |
|
|
com.github.jcustenborder.kafka.connect.twitter.Status
Twitter status message.
com.github.jcustenborder.kafka.connect.twitter.User
Return the user associated with the status.
This can be null if the instance is from User.getStatus().
Name |
Optional |
Schema |
Default Value |
Documentation |
Id |
true |
Int64 |
|
Returns the id of the user |
Name |
true |
String |
|
Returns the name of the user |
ScreenName |
true |
String |
|
Returns the screen name of the user |
Location |
true |
String |
|
Returns the location of the user |
Description |
true |
String |
|
Returns the description of the user |
ContributorsEnabled |
true |
Boolean |
|
Tests if the user is enabling contributors |
ProfileImageURL |
true |
String |
|
Returns the profile image url of the user |
BiggerProfileImageURL |
true |
String |
|
|
MiniProfileImageURL |
true |
String |
|
|
OriginalProfileImageURL |
true |
String |
|
|
ProfileImageURLHttps |
true |
String |
|
|
BiggerProfileImageURLHttps |
true |
String |
|
|
MiniProfileImageURLHttps |
true |
String |
|
|
OriginalProfileImageURLHttps |
true |
String |
|
|
DefaultProfileImage |
true |
Boolean |
|
Tests if the user has not uploaded their own avatar |
URL |
true |
String |
|
Returns the url of the user |
Protected |
true |
Boolean |
|
Test if the user status is protected |
FollowersCount |
true |
Int32 |
|
Returns the number of followers |
ProfileBackgroundColor |
true |
String |
|
|
ProfileTextColor |
true |
String |
|
|
ProfileLinkColor |
true |
String |
|
|
ProfileSidebarFillColor |
true |
String |
|
|
ProfileSidebarBorderColor |
true |
String |
|
|
ProfileUseBackgroundImage |
true |
Boolean |
|
|
DefaultProfile |
true |
Boolean |
|
Tests if the user has not altered the theme or background |
ShowAllInlineMedia |
true |
Boolean |
|
|
FriendsCount |
true |
Int32 |
|
Returns the number of users the user follows (AKA "followings") |
CreatedAt |
true |
Timestamp |
|
|
FavouritesCount |
true |
Int32 |
|
|
UtcOffset |
true |
Int32 |
|
|
TimeZone |
true |
String |
|
|
ProfileBackgroundImageURL |
true |
String |
|
|
ProfileBackgroundImageUrlHttps |
true |
String |
|
|
ProfileBannerURL |
true |
String |
|
|
ProfileBannerRetinaURL |
true |
String |
|
|
ProfileBannerIPadURL |
true |
String |
|
|
ProfileBannerIPadRetinaURL |
true |
String |
|
|
ProfileBannerMobileURL |
true |
String |
|
|
ProfileBannerMobileRetinaURL |
true |
String |
|
|
ProfileBackgroundTiled |
true |
Boolean |
|
|
Lang |
true |
String |
|
Returns the preferred language of the user |
StatusesCount |
true |
Int32 |
|
|
GeoEnabled |
true |
Boolean |
|
|
Verified |
true |
Boolean |
|
|
Translator |
true |
Boolean |
|
|
ListedCount |
true |
Int32 |
|
Returns the number of public lists the user is listed on, or -1 if the count is unavailable. |
FollowRequestSent |
true |
Boolean |
|
Returns true if the authenticating user has requested to follow this user, otherwise false. |
WithheldInCountries |
false |
Array of String |
|
Returns the list of country codes where the user is withheld |
com.github.jcustenborder.kafka.connect.twitter.ExtendedMediaEntity.Variant
Name |
Optional |
Schema |
Default Value |
Documentation |
Url |
true |
String |
|
|
Bitrate |
true |
Int32 |
|
|
ContentType |
true |
String |
|
|
com.github.jcustenborder.kafka.connect.twitter.MediaEntity.Size
Name |
Optional |
Schema |
Default Value |
Documentation |
Resize |
true |
Int32 |
|
|
Width |
true |
Int32 |
|
|
Height |
true |
Int32 |
|
|
com.github.jcustenborder.kafka.connect.twitter.ExtendedMediaEntity
com.github.jcustenborder.kafka.connect.twitter.HashtagEntity
Name |
Optional |
Schema |
Default Value |
Documentation |
Text |
true |
String |
|
Returns the text of the hashtag without #. |
Start |
true |
Int32 |
|
Returns the index of the start character of the hashtag. |
End |
true |
Int32 |
|
Returns the index of the end character of the hashtag. |
com.github.jcustenborder.kafka.connect.twitter.MediaEntity
com.github.jcustenborder.kafka.connect.twitter.SymbolEntity
Name |
Optional |
Schema |
Default Value |
Documentation |
Start |
true |
Int32 |
|
Returns the index of the start character of the symbol. |
End |
true |
Int32 |
|
Returns the index of the end character of the symbol. |
Text |
true |
String |
|
Returns the text of the entity |
com.github.jcustenborder.kafka.connect.twitter.URLEntity
Name |
Optional |
Schema |
Default Value |
Documentation |
URL |
true |
String |
|
Returns the URL mentioned in the tweet. |
Text |
true |
String |
|
Returns the URL mentioned in the tweet. |
ExpandedURL |
true |
String |
|
Returns the expanded URL if mentioned URL is shorten. |
Start |
true |
Int32 |
|
Returns the index of the start character of the URL mentioned in the tweet. |
End |
true |
Int32 |
|
Returns the index of the end character of the URL mentioned in the tweet. |
DisplayURL |
true |
String |
|
Returns the display URL if mentioned URL is shorten. |
com.github.jcustenborder.kafka.connect.twitter.UserMentionEntity
Name |
Optional |
Schema |
Default Value |
Documentation |
Name |
true |
String |
|
Returns the name mentioned in the status. |
Id |
true |
Int64 |
|
Returns the user id mentioned in the status. |
Text |
true |
String |
|
Returns the screen name mentioned in the status. |
ScreenName |
true |
String |
|
Returns the screen name mentioned in the status. |
Start |
true |
Int32 |
|
Returns the index of the start character of the user mention. |
End |
true |
Int32 |
|
Returns the index of the end character of the user mention. |
Running in development
mvn clean package
export CLASSPATH="$(find target/ -type f -name '*.jar'| grep '\-package' | tr '\n' ':')"
$CONFLUENT_HOME/bin/connect-standalone connect/connect-avro-docker.properties config/TwitterSourceConnector.properties