Closed moollaza closed 8 years ago
I'm not sure I understand the rationale of this being in duckpan. You can pull out the keys, or just dump them in a debug statement in your fathead script, and you know everything it will trigger on exactly.
I'm not sure I understand the rationale of this being in duckpan. You can pull out the keys, or just dump them in a debug statement in your fathead script, and you know everything it will trigger on exactly.
@zachthompson I'm more interested in getting it to work in DuckPAN Server so developers can actually see what the results will look like. I can look at the output.txt to get an idea of what content I'm showing, but being able to see the result is a big bonus and means you don't have to wait for us to deploy it to Beta to see it working. In fact, you don't even really need to write any code if you just manually create an output.txt and load it into DuckPAN for testing. If we add support for more Fathead templates this will become even more useful.
I agree that the duckPAN query integration doesn't offer as much benefit, but there's still something I enjoy about working with a REPL to see that the queries I'm thinking about trigger and return results. IMO looking at the data in aggregate can be a little overwhelming.
@zachthompson I agree with @moollaza on duckpan server
- many developers may wish to observe the Instant Answer in an environment similar to what they would see on production; and Beta has a huge reliance on deploys. If we want to increase the number of developers working on FatHead Instant Answers I think this is a good move.
Is this ready to start experimenting with? How can we help you test it?
I think there are a bunch of proposed fatheads in the pipeline we can start using as test load :)
@moollaza Could we use some of the information from the IA metadata (and ID) to generate some of the paths?
For example, rather than having to specify duckpan server -o lib/fathead/perl6_doc/output.txt
you would just say duckpan server Perl6Doc
or duckpan server perl6_doc
and it would figure the rest out. Admitedly FatHeads are a bit of an edge-case with a different directory structure to Goodies and Spice, but this should be easily do-able.
Edit You mention not binding FatHeads to metadata - but we do have metadata for FatHeads (DDG::Meta::Data->filter_ias({repo => 'fathead'})
) regardless of whether or not they have a Perl module; why would this be an issue?
Is this ready to start experimenting with?
@jbarrett Yes! I've been using for the past week or so. At a very basic level it's working, but without the connection to Metadata it's not as good as it could be.
How can we help you test it?
Try testing any open Fathead PR's with this to see what works and what doesn't. I don't love specifying the output.txt path and would like to improve this
Could we use some of the information from the IA metadata (and ID) to generate some of the paths?
@GuiltyDolphin that's what I'd like to do, and your suggestion is exactly what I had in mind. It just feels weird specifying the name of a Perl Module which doesn't actually exist. IMO we need a better way to connect Fathead files in the repo to their metadata, because we're relying on the name of an imaginary Perl module which doesn't exist in the repo or anywhere else.
We should probably change "Perl Module" on the IA Page to "Directory" or similar which maps to a dir in https://github.com/duckduckgo/zeroclickinfo-fathead/tree/master/lib/fathead -- that's something real that the developer has knowledge of
@zachthompson @jbarrett @jdorweiler @GuiltyDolphin what are your thoughts on that?
I just realized we assume the /lib/fathead/
dirs are named according to the Fathead IDs. That should make things easier 👍
I've now added support for Metadata and remove the -o
option.
Now you just specify the Fathead ID as the last argument:
e.g. duckpan server mdn_css
We're operating on the assumption that a dir exists in /lib/fathead/
with the same name which means we can easily check for the existence of an output.txt
file in the dir
@moollaza This appears to be triggering on everything now? If I try foo
with the perl_doc
FatHead it returns a result with all the meta information but no text (as there is no field for foo
).
This appears to be triggering on everything now?
Yup, there was a bug. Should be all fixed now 👍
I think this is pretty good for now. It does not yet support Category Pages, but AFAICT no open source Instant Answers use Category pages.
I haven't added support for External Links (infobox) yet either, but will once the mdn_css Fathead starts using them. I don't think any other IAs currently use the external links.
Added a few more commits and updated the PR description
@moollaza Getting a Can't locate object method "_get_image" via package "App::DuckPAN::FatHead" .../lib/App/DuckPAN/FatHead.pm line 173.
in duckpan query
.
@GuiltyDolphin whoops, didn't mean to change that function's name yet. Fixed
Looks like the URL is not showing in the "More at ..." even thought it can read it in DuckPAN:
For some reason I was working off the impression this had been released and opened an issue: #365 - Once you've gotten a match, that match will always display until the next match.
I thought it had been released too :)) It took me an embarrassingly long time to figure that out.
@moollaza No support for related topics? Any chance we could get support for this soonish?
No support for related topics? Any chance we could get support for this soonish?
@GuiltyDolphin I don't believe any existing IAs use it so I didn't have anything to test with. I was going to add support once the in-progress MDN CSS Fathead started using them 😉
I thought it had been released too :)) It took me an embarrassingly long time to figure that out.
If everyone's already using this, then it sounds like there aren't any major issues with it?
@zachthompson care to merge? Happy to keep refining this as necessary.
@moollaza Okay, well it's not a killer atm, but we're getting a fair few FatHeads now which may wish to use related topics, so it should definitely be one of (if not the) next features to be implemented.
Otherwise, this looks good to merge - I've been using it a fair bit and it seems to work pretty well.
Merging with @zachthompson's approval 👍
Woo!
/cc @zachthompson @jbarrett @jdorweiler
These changes add support to
duckpan query
andduckpan server
for Fathead features used by open source Fatheads.You can query the exact names of articles in the Output.txt file to see the result that will display. If an article is a redirect, it will follow the redirect and show the correct result. Disambiguations and Abstract Images are also supported.
Category Pages and External Links (infobox) support has not been implemented yet. But I plan to implement the latter soon. Also Fathead Triggers are not yet supported but I will add that soon as well, using the IA Page triggers. However, internally I don't believe we consider the triggers defined on the IA page (though we should!)
This approach ignores the DDG infrastructure and request handlers, which Spice and Goodies use. That system is heavily dependant on loading Fathead Perl modules into Blocks, but we removed the Perl modules long ago as they were only there to contain Fathead Metadata.
One other missing feature is the internal processing of the output.txt -- Internally we process each line of the output.txt and generate several variations of the Article title, e.g. for "background-color" we also create "background color" and "backgroundcolor".
I noticed that some Fathead parsing scripts implement this functionality themselves. We should probably make it known to developers that we'll do this work for them to simplify the parsers a bit.
Any feedback is welcome. I've not used DBI before so I wasn't sure about the best way to get the output.txt into a DB for searching. I've used DBD::CSV which interprets the output.txt (tab delimited) as a CSV and pulls it in to memory for SQL querying. I'm not sure how well this will work for very large output.txt files. We can easily test this though.
How to Test
git checkout
this branchgit clone git@github.com:duckduckgo/zeroclickinfo-fathead.git
$ duckpan -I ../p5-app-duckpan/lib/ server <instant_answer_id>
e.g.$ duckpan -I ../p5-app-duckpan/lib/ server mdn_css
To view the changes made while ignoring whitespace changes: https://github.com/duckduckgo/p5-app-duckpan/pull/358/files?w=0