microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] pf run create - generates snapshot with all files in project #2418

Closed sashokbg closed 6 months ago

sashokbg commented 8 months ago

Describe the bug: In my project I have a couple of very large directories, such as .venv and models. When I run pf run create, it generates a snapshot that contains the entire directory tree of my project, wasting a huge amount of disk space.

Expected behavior: Maybe add an ignore file, similar to .gitignore?
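
To illustrate the request, here is a minimal sketch of a snapshot copy that skips ignored names. It uses only the standard library and is not promptflow's actual code; `copy_snapshot` and the pattern list are hypothetical names chosen for this example, and the patterns are simple glob-style names rather than full .gitignore syntax.

```python
import shutil
import tempfile
from pathlib import Path


def copy_snapshot(src: Path, dst: Path, patterns: tuple[str, ...]) -> None:
    """Copy src to dst, skipping files and directories whose name matches a pattern.

    Hypothetical helper for illustration only; not part of promptflow.
    """
    # shutil.ignore_patterns builds the callable that copytree consults
    # for every directory it visits.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*patterns))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        (project / ".venv" / "lib").mkdir(parents=True)
        (project / ".venv" / "lib" / "big.bin").write_bytes(b"\0" * 1024)
        (project / "flow.dag.yaml").write_text("nodes: []\n")

        snapshot = Path(tmp) / "snapshot"
        copy_snapshot(project, snapshot, (".venv", "models"))
        # Only the flow file is copied; .venv is skipped entirely.
        print(sorted(p.name for p in snapshot.iterdir()))
```

With something along these lines, directories such as .venv and models would never reach the snapshot folder, so its size would track the flow's own files.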

Screenshots

du -d 1 ~/.promptflow/.runs/first_run -h
240K    /home/alexander/.promptflow/.runs/first_run/node_artifacts
25G     /home/alexander/.promptflow/.runs/first_run/snapshot
96K     /home/alexander/.promptflow/.runs/first_run/flow_artifacts
12K     /home/alexander/.promptflow/.runs/first_run/flow_outputs
25G     /home/alexander/.promptflow/.runs/first_run

Running information: promptflow 1.6.0

Executable '/home/alexander/Games2/degiro-faq-assistant/.venv/bin/python' Python (Linux) 3.11.7 (main, Jan 29 2024, 16:03:57) [GCC 13.2.1 20230801]

Linux alexander-desktop 6.6.16-2-MANJARO #1 SMP PREEMPT_DYNAMIC Sat Feb 10 09:40:02 UTC 2024 x86_64 GNU/Linux

wangchao1230 commented 8 months ago

Hi @sashokbg, thanks for reporting this to us.

I think we should make two enhancements on this:

We will let you know when we have this worked out for you. As a workaround, is it possible to move those big directories out of the flow folder?

sashokbg commented 8 months ago

Hello @wangchao1230, thank you for your prompt reply. Yes, I can manage a workaround for now :)

Have a nice day!

sashokbg commented 8 months ago

@wangchao1230 can you point me to where to look in the code, please? I might have enough time to try to add the gitignore part.

wangchao1230 commented 8 months ago

Our existing logic is here: https://github.com/microsoft/promptflow/blob/0ae6a0aa36dd900fcfdd67b1a868a1d88c6fb964/src/promptflow/promptflow/_sdk/_utils.py#L449

github-actions[bot] commented 7 months ago

Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!

sashokbg commented 7 months ago

Hello @wangchao1230, so there is already ignore logic, but it seems it is not working?

wangchao1230 commented 7 months ago

Yes. The current implementation may have some limitations, such as only looking at the ignore file in the code folder and not searching up through the parent folders.
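
For anyone picking this up, the parent-folder search could look roughly like the sketch below. This is an assumption about one possible approach, not the existing promptflow logic; `find_ignore_file` is a hypothetical helper, and it returns only the nearest ignore file, whereas git itself combines ignore files from every directory level.

```python
from pathlib import Path
from typing import Optional


def find_ignore_file(start: Path, name: str = ".gitignore") -> Optional[Path]:
    """Return the nearest file called `name` in `start` or any of its ancestors.

    Hypothetical helper for illustration only; not part of promptflow.
    """
    start = start.resolve()
    # Check the starting folder first, then walk up towards the filesystem root.
    for folder in (start, *start.parents):
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None
```

Called with the flow folder as `start`, this would pick up a .gitignore at the project root even when the flow folder itself has none, which matches the setup described in this issue.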
