block / goose

Goose is a developer agent that operates from your command line to help you do the boring stuff.
https://block.github.io/goose/
Apache License 2.0
108 stars, 17 forks

fix: just adding stuff from developer.py to synopsis developer #182

Closed michaelneale closed 1 day ago

michaelneale commented 1 day ago

This was stuff that didn't make it over yet

lamchau commented 1 day ago

ahh, didn't realize about licensing. is there an automation we can add to audit/check these things? i'm not very well versed in the legalese to know

On Tue, Oct 22, 2024 at 18:32 Michael Neale @.***> wrote:

@.**** commented on this pull request.

In src/goose/synopsis/toolkit.py https://github.com/block/goose/pull/182#discussion_r1811635410:

+        Args:
+            url (str): url of the site to visit.
+        Returns:
+            (dict): A dictionary with two keys:
+                - 'html_file_path' (str): Path to a html file which has the content of the page. It will be very large so use rg to search it or head in chunks. Will contain meta data and links and markup.
+                - 'text_file_path' (str): Path to a plain text file which has some of the content of the page. It will be large so use rg to search it or head in chunks. If content isn't there, try the html variant.
+        """  # noqa
+        friendly_name = re.sub(r"[^a-zA-Z0-9]", "_", url)[:50]  # Limit length to prevent filenames from being too long
+        try:
+            result = httpx.get(url, follow_redirects=True).text
+            with tempfile.NamedTemporaryFile(delete=False, mode="w", suffix=f"_{friendly_name}.html") as tmp_file:
+                tmp_file.write(result)
+                tmp_text_file_path = tmp_file.name.replace(".html", ".txt")
+                plain_text = re.sub(
+                    r"<head.*?>.*?</head>|<script.*?>.*?</script>|<style.*?>.*?</style>|<[^>]+>",

it is GPL (v3) so a no go (already looked at that)
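
For context on the quoted snippet: it strips markup with a plain regex instead of pulling in a third-party converter. A minimal sketch of that idea, using only the standard library, could look like the block below. The helper name `strip_markup` is hypothetical, and the pattern is an assumption about what the truncated diff intends, not a copy of the merged code.

```python
import re

# Assumed pattern: drop <head>, <script>, and <style> blocks whole,
# then remove any remaining tag. Not taken verbatim from the PR.
_TAG_PATTERN = re.compile(
    r"<head.*?>.*?</head>|<script.*?>.*?</script>|<style.*?>.*?</style>|<[^>]+>",
    flags=re.DOTALL | re.IGNORECASE,
)


def strip_markup(html: str) -> str:
    """Return a rough plain-text version of an HTML string (hypothetical helper)."""
    text = _TAG_PATTERN.sub("", html)
    # Collapse the whitespace runs left behind by removed tags.
    return re.sub(r"\s+", " ", text).strip()
```

This is lossy by design (no entity decoding, no layout), which matches the docstring's advice to fall back to the html file when the text file is missing content.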


michaelneale commented 1 day ago

@lamchau yes! https://github.com/block/goose/pull/184 - can do it that way
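
What #184 actually does is not shown in this thread. As a rough illustration of the kind of automation asked about above, here is a minimal sketch that scans the installed Python packages and flags any whose declared license mentions GPL. The function names and the "contains GPL" rule are assumptions for illustration, not the project's policy, and package metadata is self-reported, so treat hits as a prompt for review rather than a legal answer.

```python
from importlib import metadata


def is_denied(license_text: str) -> bool:
    """Hypothetical policy: flag anything mentioning GPL (this also catches LGPL and AGPL)."""
    return "GPL" in license_text.upper()


def license_strings(dist: metadata.Distribution) -> list[str]:
    """Collect the license declarations a distribution exposes in its metadata."""
    meta = dist.metadata
    found = [str(meta.get("License-Expression") or ""), str(meta.get("License") or "")]
    classifiers = meta.get_all("Classifier") or []
    found += [c for c in classifiers if c.startswith("License ::")]
    # The License field sometimes holds the full license text, so keep a short prefix.
    return [s[:80] for s in found if s]


def audit_installed() -> dict[str, list[str]]:
    """Map package name -> flagged license strings for the current environment."""
    flagged: dict[str, list[str]] = {}
    for dist in metadata.distributions():
        try:
            hits = [s for s in license_strings(dist) if is_denied(s)]
            name = str(dist.metadata.get("Name") or "unknown")
        except Exception:
            # Skip distributions with missing or unreadable metadata.
            continue
        if hits:
            flagged[name] = hits
    return flagged


if __name__ == "__main__":
    for name, hits in sorted(audit_installed().items()):
        print(f"{name}: {hits}")
```

Run in CI after installing dependencies, a non-empty result could fail the build so a reviewer looks at the flagged packages.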