Closed TaufeeqNoamaan closed 7 months ago
Hey @TaufeeqNoamaan,
The error message indicates that you don't pass the `key` field to Pathway. Indeed, there's only a `text` field in the message.
In your code, you call `self.next_json({"text": comment})`, so only the `text` field is passed to Pathway. Please add the `key` field here as well and it should work.
Please let me know if that helps.
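As a rough illustration of the point above, a message can be checked against the schema's column names before it is handed to `next_json`. This is only a sketch in plain Python: `missing_fields` and `REQUIRED_FIELDS` are hypothetical names, not part of Pathway, and the field list is simply copied from the `InputSchema` in the report.

```python
from typing import Iterable, List

# Column names taken from the InputSchema in the report (assumption: both are required).
REQUIRED_FIELDS = ("key", "text")


def missing_fields(message: dict, required: Iterable[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required field names that are absent from the message."""
    return [name for name in required if name not in message]


# The original message carried only "text", so "key" is reported as missing.
print(missing_fields({"text": "some comment"}))
# With both fields present, nothing is missing.
print(missing_fields({"key": 1, "text": "some comment"}))
```

Calling such a helper right before `next_json` would surface a missing primary key in your own code rather than in the connector's logs.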
Hey @KamilPiechowiak
I rectified it, but still the same error persists
    def run(self) -> None:
        top_comments = self._reddit_client.get_top_comments()
        for idx, comment in enumerate(top_comments, start=1):
            self._subject.next_json({"key": idx, "text": comment})
            yield {"key": idx, "text": comment}
The dashboard won't open now
Please paste the error message. Are the error messages being logged for each input message again?
After including the "key" param, I'm not getting any error logs as such. The dashboard won't open and the file just executes and terminates
Does it terminate with an error?
Did you check whether the messages are written to `table.jsonlines` after executing the program?
Nope, doesn't terminate with an error and no messages are written to jsonlines files.
Also tried output with csv, same thing.
Your `run` method should look like this:

    def run(self) -> None:
        top_comments = self._reddit_client.get_top_comments()
        for idx, comment in enumerate(top_comments, start=1):
            self.next_json({"key": idx, "text": comment})
In the code you pasted, you have `self._subject.next_json`, which means that your method is probably outside of `RedditSubject`, and it is impossible for me to say what's wrong.
Please paste the full version of your code (and please take care of properly formatting the code).
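Separately from the `self._subject` question, note that the earlier snippet ends with a `yield`. In Python, any function containing `yield` is a generator function, so calling it only creates a generator object and does not execute the body. The sketch below illustrates this with `SubjectStub`, a hypothetical stand-in that merely records messages. It is not Pathway's `ConnectorSubject`, and it assumes the caller invokes `run()` without iterating over the result.

```python
class SubjectStub:
    """Hypothetical stand-in for a connector subject; it only records messages."""

    def __init__(self) -> None:
        self.sent = []

    def next_json(self, message: dict) -> None:
        self.sent.append(message)


class YieldingSubject(SubjectStub):
    def run(self):
        for idx, comment in enumerate(["first", "second"], start=1):
            self.next_json({"key": idx, "text": comment})
            # The yield turns run() into a generator function.
            yield {"key": idx, "text": comment}


class PlainSubject(SubjectStub):
    def run(self) -> None:
        for idx, comment in enumerate(["first", "second"], start=1):
            self.next_json({"key": idx, "text": comment})


yielding = YieldingSubject()
yielding.run()  # creates a generator; the loop body never executes

plain = PlainSubject()
plain.run()  # executes the loop and records both messages

print(len(yielding.sent), len(plain.sent))  # 0 2
```

If the real connector behaves like this stub in that respect, a `run` method containing `yield` would finish without emitting anything and without raising an error, which matches the silent termination described above.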
You are right @KamilPiechowiak, it was indeed a problem in the scope of the method.
Issue is solved!
Thanks a lot!!
I'm glad that your problem is solved.
Steps to reproduce
I'm trying to connect Reddit with Pathway.
This is the code:

    class RedditClient:
        def __init__(self, keyword, client_id=, client_secret= , user_agent='pathway::1.0 (by /u/)'):
            self.reddit = praw.Reddit(client_id=client_id, client_secret=client_secret, user_agent=user_agent)
            self.keyword = keyword

    class RedditSubject(pw.io.python.ConnectorSubject):
        _reddit_client: RedditClient

    class InputSchema(pw.Schema):
        key: int = pw.column_definition(primary_key=True)
        text: str

    input = pw.io.python.read(
        RedditSubject("python"),
        schema=InputSchema,
        autocommit_duration_ms=1000,
    )
    pw.io.jsonlines.write(input, "table.jsonlines")
    pw.run()
Reddit client_id and secret can be obtained from here: https://www.reddit.com/prefs/apps
Relevant log output
What did you expect to happen?
The data should've been written to the json file
Version
0.8.5
Docker Versions (if used)
No response
OS
Linux
On which CPU architecture did you run Pathway?
x86-64