SignifAi / Spark-PubSub

Google Cloud Pubsub connector for Spark Streaming
Apache License 2.0

404 Resource not found (resource=pyspark_subscription). #6

Open brunopistone opened 4 years ago

brunopistone commented 4 years ago

Hi, I'm trying to use your library to build a PySpark streaming pipeline on Pub/Sub. I followed your guide and wrote this code:

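import os

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

# PubsubUtils (from this library), transform, updateFunc and initialStateRDD
# are imported or defined elsewhere in the project.

# Point the Google client libraries at the service account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = \
        os.path.join(os.curdir, "config", "google", "sa-pubsub.json")

config = SparkConf() \
        .setAppName("streaming_pipeline") \
        .setMaster("local[1]") \
        .set("spark.jars", "jars/spark_pubsub-1.1-SNAPSHOT.jar")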

spark_context = SparkContext(conf=config)

spark_context.setLogLevel("ERROR")

streaming_context = StreamingContext(spark_context, 1)
streaming_context.checkpoint("streaming_checkpoints")

socket = PubsubUtils.createStream(
        streaming_context,
        "projects/project_id/subscriptions/pyspark_subscription",
        5,
        False
    )

messages = socket \
        .flatMap(transform.flat_map) \
        .map(transform.map_function) \
        .updateStateByKey(updateFunc, initialRDD=initialStateRDD)

messages.pprint()

streaming_context.start()
streaming_context.awaitTermination()

But I receive this error:

Deregistered receiver for stream 0: Restarting receiver with delay 2000ms: Error while fetching messages from pubsub for subscription projects/project_id/subscriptions/pyspark_subscription - com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Resource not found (resource=pyspark_subscription).",
    "reason" : "notFound"
  } ],
  "message" : "Resource not found (resource=pyspark_subscription).",
  "status" : "NOT_FOUND"
}

Can you help me? Thank you

anguillanneuf commented 3 years ago

@bp91 Not sure whether you have already unblocked yourself on this one. I was able to receive messages from my Pub/Sub subscription using a script similar to yours. The NOT_FOUND error sounds more like a problem in your Pub/Sub setup. Were you trying to access a subscription in the same project as the service account key referenced by GOOGLE_APPLICATION_CREDENTIALS?
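
One way to check the project question above is to compare the project ID in the subscription path with the `project_id` field of the service account key file. The following is a minimal sketch, not part of this library: the helper names (`subscription_project`, `key_project`, `same_project`) are hypothetical, and it assumes the key is a standard service account JSON key containing a `project_id` field. A mismatch is only a hint, since a service account can be granted access to a subscription in another project.

```python
import json
import re

# Full subscription paths have the form projects/<project>/subscriptions/<name>.
_SUBSCRIPTION_RE = re.compile(r"^projects/([^/]+)/subscriptions/([^/]+)$")


def subscription_project(subscription_path):
    """Return the project ID embedded in a full subscription path."""
    match = _SUBSCRIPTION_RE.match(subscription_path)
    if match is None:
        raise ValueError(
            "expected projects/<project>/subscriptions/<name>, got %r"
            % subscription_path
        )
    return match.group(1)


def key_project(key_file):
    """Return the project_id field of a service account JSON key file."""
    with open(key_file) as handle:
        return json.load(handle)["project_id"]


def same_project(subscription_path, key_file):
    """True if the subscription path and the key file name the same project."""
    return subscription_project(subscription_path) == key_project(key_file)
```

With the values from the original post, the call would be `same_project("projects/project_id/subscriptions/pyspark_subscription", os.environ["GOOGLE_APPLICATION_CREDENTIALS"])`. If it returns `False`, it is worth confirming that the subscription really exists in the project named in the path and that the service account has been granted access to it.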