wso2 / analytics-apim

Analytics for APIM
Apache License 2.0

IP with CIDR suffix is not resolved in the GEO location script, which stops with an error #626

Open tgtshanika opened 5 years ago

tgtshanika commented 5 years ago

Description:

TID: [-1] [] [2017-11-06 22:44:12,580] INFO {org.wso2.carbon.analytics.spark.admin.AnalyticsProcessorAdminService} - Started executing the script : APIMAnalytics-APIM_GEO_LOCATION_STATS-APIM_GEO_LOCATION_STATS-batch1 {org.wso2.carbon.analytics.spark.admin.AnalyticsProcessorAdminService}
TID: [-1234] [] [2017-11-06 22:44:12,640] INFO {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor} - Executed query: CREATE TEMPORARY TABLE APIGeoLocationData USING CarbonJDBC OPTIONS (dataSource "WSO2AM_STATS_DB", tableName "API_REQ_GEO_LOC_SUMMARY", schema "api STRING , version STRING , apiPublisher STRING , tenantDomain STRING , total_request_count INTEGER , year INTEGER , month INTEGER , day INTEGER , requestTime LONG , country STRING , city STRING ", primaryKeys "api,version,apiPublisher,year,month,day,country,city,tenantDomain" )
Time Elapsed: 0.055 seconds. {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor}
TID: [-1234] [] [2017-11-06 22:44:12,734] INFO {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor} - Executed query: CREATE TEMPORARY TABLE APIRequestData USING CarbonAnalytics OPTIONS(tableName "ORG_WSO2_APIMGT_STATISTICS_PERMINUTEREQUEST", schema " year INT -i, month INT -i, day INT -i, hour INT -i, minute INT -i, consumerKey STRING, context STRING, api_version STRING, api STRING, version STRING, requestTime LONG, userId STRING, hostName STRING, apiPublisher STRING, total_request_count LONG, resourceTemplate STRING, method STRING, applicationName STRING, tenantDomain STRING, userAgent STRING, resourcePath STRING, request INT, applicationId STRING, tier STRING, throttledOut BOOLEAN, clientIp STRING, applicationOwner STRING, _timestamp LONG -i", primaryKeys "year, month, day, hour, minute, consumerKey, context, api_version, userId, hostName, apiPublisher, resourceTemplate, method, userAgent, clientIp", incrementalParams "GEO_APIMGT_PERMINUTE_REQUEST_DATA, DAY", mergeSchema "false")
Time Elapsed: 0.09 seconds. {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor}
TID: [-1] [] [2017-11-06 22:44:30,159] WARN {org.apache.spark.scheduler.TaskSetManager} - Lost task 2.0 in stage 356089.0 (TID 311137, platform-default-dasanalyzer-01): java.lang.Exception: Error while invoking method: getCountry, null
at org.wso2.carbon.analytics.spark.core.udf.adaptor.UDF1Adaptor.call(UDF1Adaptor.java:64)
at org.apache.spark.sql.UDFRegistration$$anonfun$register$25$$anonfun$apply$1.apply(UDFRegistration.scala:422)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:514)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: For input string: "70/32"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at org.wso2.carbon.analytics.shared.geolocation.impl.LocationResolverRdbms.getIpV4ToLong(LocationResolverRdbms.java:181)
at org.wso2.carbon.analytics.shared.geolocation.impl.LocationResolverRdbms.getLocationFromLongValueOfIp(LocationResolverRdbms.java:104)
at org.wso2.carbon.analytics.shared.geolocation.impl.LocationResolverRdbms.getLocationFromIp(LocationResolverRdbms.java:135)
at org.wso2.carbon.analytics.shared.geolocation.impl.LocationResolverRdbms.getLocation(LocationResolverRdbms.java:61)
at org.wso2.carbon.analytics.shared.geolocation.impl.GeoLocationResolverUDF.getCountry(GeoLocationResolverUDF.java:43)
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.wso2.carbon.analytics.spark.core.udf.adaptor.UDF1Adaptor.call(UDF1Adaptor.java:60)
... 21 more
{org.apache.spark.scheduler.TaskSetManager}
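
The `Caused by` entry shows `Integer.parseInt` rejecting the string "70/32" inside `getIpV4ToLong`. That suggests the `clientIp` value carried a CIDR prefix length (for example something like `10.0.0.70/32`) and that the last octet was parsed with the `/32` still attached. The following is only a minimal sketch of one possible guard, not the actual `LocationResolverRdbms` code: the class name, both method names, and the sample address are hypothetical, chosen for illustration.

```java
public class CidrSafeIp {

    // Hypothetical helper: drops a trailing CIDR prefix length such as "/32".
    static String stripCidrSuffix(String ip) {
        int slash = ip.indexOf('/');
        return slash >= 0 ? ip.substring(0, slash) : ip;
    }

    // Converts a dotted-quad IPv4 address (optionally with a CIDR suffix)
    // to its numeric value, rejecting anything that is not four octets.
    static long ipV4ToLong(String ip) {
        String[] octets = stripCidrSuffix(ip.trim()).split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long result = 0;
        for (String octet : octets) {
            int value = Integer.parseInt(octet);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range in: " + ip);
            }
            result = (result << 8) | value;
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample address is an assumption; the log only shows the fragment "70/32".
        System.out.println(ipV4ToLong("10.0.0.70/32")); // prints 167772230
        System.out.println(ipV4ToLong("10.0.0.70"));    // prints 167772230
    }
}
```

Stripping the suffix treats `a.b.c.d/32` as the single host `a.b.c.d`. Whether that is the right behaviour for wider prefixes (such as `/24`) would need to be decided separately, since such a value names a range rather than one client.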

By @tharindu1st from https://wso2.org/jira/browse/ANLYAPIM-170

ruks commented 4 years ago

This needs to be verified against the Siddhi-based IP-to-geolocation mapping library.