jarun / googler

Google from the terminal

Return non-zero exit code when no hits are returned #73

Closed spamwalter closed 8 years ago

spamwalter commented 8 years ago

I am automating searches and would like to detect failures without parsing the stdout. The grep command line tool allows for this by returning a non-zero exit code when it matches nothing. I would like this feature in googler.

Thank you!

(I love this software!)

jarun commented 8 years ago

Two scenarios, considering your earlier feature request:

  1. omniprompt enabled: googler will not exit
  2. omniprompt disabled: the non-zero exit code will apply to this scenario.

I love this software!

Please award a star if you find googler helpful :)

spamwalter commented 8 years ago

Right! Makes sense.

Writing you back on the other thing.

W.

zmwangx commented 8 years ago

I don't think this is a valid concern. For any reasonable Google search, you should expect results (except for point 2 in notes; do let us know if you encounter that frequently, and we might consider fetching the next page). googler's job is to parse Google's results and print them. We should only fail when we fail to fetch results from Google, or fail to parse the results. (We do fail with nonzero exit code in those cases.)

Also, you pointed us to grep, but I'll also point you to find which returns zero even when nothing is found. There is no agreed-upon behavior for search-related tools, and I certainly can't think of a scenario where googler is used somewhat like grep -q. Please discuss a use case.

zmwangx commented 8 years ago

Now that I think about it, maybe you want to do site-specific searches and detect no-hit? That's reasonable, but I still don't think we should return a non-zero code when the execution is successful.

Also, please still discuss your exact use case. In general I don't think you should parse output from googler — it's not always consistent, and is subject to change without notice. Maybe we should entertain the idea of, say, JSON output. Then you can easily check for emptiness.

zmwangx commented 8 years ago

Another thing to keep in mind: googler can handle automation, but it's not designed for that purpose. If you spam requests in a narrow window, Google will block you for several hours.

spamwalter commented 8 years ago

Hi!

Answering your various emails in one.

I apologize if I am not submitting these thoughts in a useful forum. I’ll try to do better in future.

I’m testing against a particular file type. For example, I want to know, “Can I give a public example of a Python identifier named ‘runGoogler2’?” I limit ‘filetype’ for this. I also run quoted and unquoted searches, but I only run the unquoted search if I get no hits from the quoted one. It’s a case where an exact hit is best, but a close hit might be good enough, so I run both.
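The quoted-then-unquoted workflow described here can be sketched roughly as follows. This is only an illustration: `search_with_fallback` and the `search` callable are hypothetical names, and `search` stands in for whatever actually runs the query and returns a list of hits.

```python
from typing import Callable, List, Tuple


def search_with_fallback(
    term: str, search: Callable[[str], List[dict]]
) -> Tuple[str, List[dict]]:
    """Try the exact-phrase query first; fall back to the loose query on no hits.

    `search` is a hypothetical callable that takes a query string and
    returns a (possibly empty) list of result records.
    """
    quoted = f'"{term}"'
    hits = search(quoted)
    if hits:
        return quoted, hits
    # Only issue the second request when the exact query came back empty,
    # which also keeps the number of requests sent to Google down.
    return term, search(term)
```

The function returns the query that was finally used along with its hits, so the caller can tell an exact match from a close one when reviewing results by hand.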

I certainly agree that you should not parse output from googler. When I do get results, I want to review them by hand. Automating the searches just saves me time.

If I don’t have this feature, I MUST examine the output stream to figure out if I got hits.

Other tools that set an error code when nothing is found include:

pdfgrep

DOS FIND will set the ERRORLEVEL to 0 if found, 1 if not found.

DOS FC will set ErrorLevel as follows:

  -1 Invalid syntax (e.g. only one file passed)
  0 The files are identical.
  1 The files are different.

I understand your point about grep failing instead of just not matching, but different error codes could be (probably are?) used, as in FC above. For my purposes, I haven’t had any trouble with grep failing.

I have run into the Google limit, but I’ll just do the rest later.

You guys are heroes for writing this thing!

Thank you!

Walter.


spamwalter commented 8 years ago

Oops! One more thing.

Is there a downside to returning the exit code? I don’t think it makes grep harder to use in normal cases.

Thanks,

W.


zmwangx commented 8 years ago

Yes, that's an understandable use case, but I certainly don't want to rework exit codes just to support this niche use case. Keep in mind that googler is not bullet-proof; it still has uncaught exceptions. We won't be able to control exit codes until we eliminate all uncaught exceptions, which IMO is overkill.

Also, I did point to find(1). I won't comment on whether I like a zero exit status when nothing is found, but at least it's out there, and you're expected to check output sometimes.

Let's wait for @jarun's final word, but I would label this a wontfix, at least for now.

What I will do is to add a JSON output option (with @jarun's permission) to make output parsing easier.

spamwalter commented 8 years ago

Wonderful! I trust you guys to make the best decision for your tool.

Thanks for listening.

W.


jarun commented 8 years ago

--noninteractive makes sense. When I look up IMDB for a movie I fetch only 4 results and quit. Just wondering if we can make it smaller (--noprompt for example) and how about a short option?

The json option sounds much better. The reason is, if I run a query which has no results (which is a valid "search"), I get a valid result with 0 links. What are we expected to indicate in the exit code here?

zmwangx commented 8 years ago

See https://github.com/jarun/googler/pull/74#issuecomment-216611038.

spamwalter commented 8 years ago

(GNU) grep handles it this way:

“Normally the exit status is 0 if a line is selected, 1 if no lines were selected, and 2 if an error occurred. However, if the -q or --quiet or --silent option is used and a line is selected, the exit status is 0 even if an error occurred. Other grep implementations may exit with status greater than 2 on error.”

You could add error numbers as you discover reasonable error cases.

Thanks,

W.
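For readers following along, the three-way grep convention quoted above could be consumed from Python roughly like this. It is a minimal sketch: `classify_grep_status` and `grep_quiet` are illustrative names, and `grep_quiet` assumes a `grep` binary is on the PATH.

```python
import subprocess

# Under the convention quoted above, 0 and 1 are both normal outcomes.
GREP_MEANINGS = {0: "match", 1: "no match"}


def classify_grep_status(returncode: int) -> str:
    """Map a grep exit status to one of "match", "no match", or "error"."""
    # Status 2 is the documented error code; other implementations may use
    # values greater than 2, so everything else is treated as an error too.
    return GREP_MEANINGS.get(returncode, "error")


def grep_quiet(pattern: str, text: str) -> str:
    """Run `grep -q` over `text` on stdin and report which outcome occurred."""
    proc = subprocess.run(["grep", "-q", "--", pattern], input=text, text=True)
    return classify_grep_status(proc.returncode)
```

The point of the sketch is that the caller never looks at stdout: the distinction between "nothing found" and "something went wrong" is carried entirely by the status.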


zmwangx commented 8 years ago

As I said we don't have full control over exit status yet (and may never do), so conditional exit status isn't much of an option for us.

jarun commented 8 years ago

Had it been an API, I would have settled for returning the number of results fetched. However, this is an exit code and I don't see why grep should be treated as definitive. The success and failure of each program is subjective.

@spamwalter do you think that checking json output wouldn't help in your case?

zmwangx commented 8 years ago

JSON output in #75.

Closing this as wontfix.

spamwalter commented 8 years ago

The Python subprocess module can’t react automatically to JSON output. It can react to an exit code. Most scripting and shell environments speak exit codes as their native language.
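As an illustration of what "reacting automatically" looks like in the subprocess module: with `check=True`, `subprocess.run` raises `CalledProcessError` for any non-zero exit status. The sketch below uses a stand-in child process rather than googler, and the helper names `child` and `succeeded` are made up for the example.

```python
import subprocess
import sys


def child(exit_status: int) -> list:
    """Build a command for a stand-in child process that exits with the given status."""
    return [sys.executable, "-c", f"import sys; sys.exit({exit_status})"]


def succeeded(cmd: list) -> bool:
    """Return True if `cmd` exits with status 0, False for any other status."""
    try:
        # check=True turns a non-zero exit status into CalledProcessError.
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError:
        return False
    return True
```

No parsing of the child's output is involved; that is the convenience being asked for in this issue.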

I don’t mean to beat anyone over the head with grep (and DOS FIND, and DOS FC). I’ll say four things.

  1. Grep’s idiom is easy for me to use for the above reason.
  2. It is consistent with my personal (perhaps idiosyncratic) expectations.
  3. It can’t be too broken since grep has been around for a while.
  4. I’m not sure what the downside would be. Environments wouldn’t have to acknowledge it. I can’t think of what would break as a result.

Now, I don’t have to write the code, so I’ll leave it at that.

I know you guys will make the right decision for your project.

Thanks for listening.

W.


zmwangx commented 8 years ago

Nonzero exit code means failure. That's it. I love grep -q too, but that's a stretch.

We don't consider it a failure when a search returns zero results. Simple as that.

“Most scripting and shell environments speak exit codes as their native language.”

Not really. Exit status is nice, but Unix philosophy: "Write programs to handle text streams, because that is a universal interface." That's the spirit of the shell: whipuptitude, stitch things together. We have given you a JSON interface (or rather, will give you one very soon). Use that.

zmwangx commented 8 years ago

Plus the fact that googler is meant to be an interactive tool in the first place.

zmwangx commented 8 years ago

Locking this conversation because

  1. I've made my stance and reasoning very clear;
  2. It's usually counterproductive to argue in a closed, wontfix issue.

jarun commented 8 years ago

@spamwalter I have put forward my arguments earlier and am fine with not fixing this issue. You have to work around this with the non-interactive mode and JSON output support. That's the best we can do.

A quick solution seems to be a wrapper script around googler that does the JSON parsing and returns an error exit code in case of no results.
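Such a wrapper might look roughly like the sketch below. It rests on several assumptions that are not confirmed in this thread: that `googler --json` runs without prompting (if it does not, add the non-interactive option as well), that it prints a JSON array which is empty when there are no results, and that the exit values chosen here (0, 1, 2, borrowed from grep) suit the caller. The names `status_from_output` and `main` are illustrative.

```python
import json
import subprocess
import sys

EXIT_HITS = 0      # at least one result
EXIT_NO_HITS = 1   # googler ran fine but returned nothing (this wrapper's own choice)
EXIT_FAILURE = 2   # googler failed, is missing, or printed something that is not JSON


def status_from_output(returncode: int, stdout: str) -> int:
    """Turn googler's exit status and JSON output into a grep-style status."""
    # Any non-zero status from googler (including an uncaught exception)
    # is reported as a failure, never as "no hits".
    if returncode != 0:
        return EXIT_FAILURE
    try:
        results = json.loads(stdout)
    except json.JSONDecodeError:
        return EXIT_FAILURE
    # Assumption: the JSON output is a list that is empty when nothing was found.
    return EXIT_HITS if results else EXIT_NO_HITS


def main(argv: list) -> int:
    """Run googler with --json, pass its output through, and return the status."""
    try:
        proc = subprocess.run(
            ["googler", "--json", *argv], stdout=subprocess.PIPE, text=True
        )
    except FileNotFoundError:
        return EXIT_FAILURE
    sys.stdout.write(proc.stdout)
    return status_from_output(proc.returncode, proc.stdout)
```

To use it as a script, end the file with `sys.exit(main(sys.argv[1:]))`. Keeping "no hits" and "failure" on different values means a caller can still tell an empty search apart from a blocked or broken one.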

Thanks for your understanding.

zmwangx commented 8 years ago

Forgot to mention in this thread that we have shipped JSON output (--json) in cd74e6783603eb1088aa7a24de90f5f0ebae4cb9. The interface should be stable, but there's no guarantee.