ronentk closed this 3 months ago
@ronentk I merged and tried running main.test.py, but I get the following error:
2024-06-28 15:26:07.799 | INFO | shared_functions.parsers.post_parser_chain:__init__:27 - Initializing parser chain 'hashtags'
2024-06-28 15:26:07.834 | INFO | shared_functions.main:SM_FUNCTION_post_parser_imp:30 - Running parser on content: {'thread': [{'content': 'This is an interesting paper https://arxiv.org/abs/2312.05230 but I disagree with its sequel https://anotherlink.io #user-hashtag'}], 'author': {'platformId': <enum 'SocialPlatformType'>, 'id': '12345', 'username': 'johndoe', 'name': 'John Doe'}}...
Traceback (most recent call last):
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/main.test.py", line 35, in <module>
    result = SM_FUNCTION_post_parser_imp(thread_data, parameters, config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/shared_functions/main.py", line 33, in SM_FUNCTION_post_parser_imp
    result = parser.process_text(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/shared_functions/parsers/multi_chain_parser.py", line 288, in process_text
    ref_post: RefPost = convert_text_to_ref_post(text)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/shared_functions/schema/helpers.py", line 12, in convert_text_to_ref_post
    urls = extract_and_expand_urls(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/shared_functions/utils.py", line 165, in extract_and_expand_urls
    orig_urls = extract_urls(text)
                ^^^^^^^^^^^^^^^^^^
  File "/home/pepo/pr/cs/sensemakers/app/firebase-py/functions/shared_functions/utils.py", line 137, in extract_urls
    res = re.findall(url_regex, text)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pepo/anaconda3/lib/python3.11/re/__init__.py", line 216, in findall
    return _compile(pattern, flags).findall(string)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'dict'
Also, I removed the TypedDict interface that I created, which it seems you ended up merging. It's better to have just one spec for these types, and yours should be the one.
Check the latest commit on https://github.com/Common-SenseMakers/sensemakers/tree/merge-nlp-dev
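For reference, the TypeError in the traceback above is what `re.findall` raises when it is handed a dict rather than a string. The log shows the parser being called with a thread-shaped dict (`{'thread': [{'content': ...}], 'author': {...}}`), so one possible guard is to flatten the thread to text before URL extraction. The sketch below is only an illustration under assumptions: `URL_REGEX` and `thread_to_text` are hypothetical names, the regex is a stand-in for the project's real `url_regex` (not shown in this thread), and this is not necessarily how the repository resolves the issue.

```python
import re
from typing import Any

# Hypothetical stand-in; the project's actual url_regex is not shown in this thread.
URL_REGEX = r"https?://\S+"


def thread_to_text(content: Any) -> str:
    """Return plain text from either a string or a thread-style dict.

    Assumes the dict shape seen in the log: {'thread': [{'content': str}, ...]}.
    """
    if isinstance(content, str):
        return content
    if isinstance(content, dict) and "thread" in content:
        # Join the text of every post in the thread, one post per line.
        return "\n".join(post["content"] for post in content["thread"])
    raise TypeError(
        f"expected str or thread dict, got {type(content).__name__!r}"
    )


def extract_urls(content: Any) -> list[str]:
    """Extract URLs after normalizing the input to a string."""
    return re.findall(URL_REGEX, thread_to_text(content))


# Thread payload modeled on the one in the log output (author fields trimmed).
thread_data = {
    "thread": [
        {
            "content": "This is an interesting paper "
            "https://arxiv.org/abs/2312.05230 but I disagree with its "
            "sequel https://anotherlink.io #user-hashtag"
        }
    ],
    "author": {"id": "12345", "username": "johndoe", "name": "John Doe"},
}

urls = extract_urls(thread_data)
```

Calling `re.findall(URL_REGEX, thread_data)` directly reproduces the TypeError, while `extract_urls(thread_data)` accepts both the old string input and the new thread input.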
Adding support for threads #102
Update research filter classification to support multiple types #103
#105 Add optional token length limit for parser inputs
Quote tweet support #89, #94, #98
#99 citoid mis-detects non twitter domain