acl-org / acl-anthology

Data and software for building the ACL Anthology.
https://aclanthology.org
Apache License 2.0

Feedback on the new Anthology website #170

Closed mbollmann closed 11 months ago

mbollmann commented 5 years ago

This thread is intended to collect all feedback, suggestions, bug reports, etc. for the new Anthology website in the static-rewrite branch.

(Edit: live demo here at http://aclweb.org/anthology)

If you do not have a GitHub account, you're also welcome to send me feedback via e-mail (marcel@bollmann.me) or Twitter (@mmbollmann)!

Known Issues

akoehn commented 5 years ago

I really like it, especially the speed!

There is a display:none span containing the text "bib" in the bibtex block inside the acl-paper-link-block block. When using a text browser, this leads to the text being BibTeXbib. That span should be removed.

As a minor comment: Could you specify the hardware requirements for building the anthology a bit? How much time & memory does building take? "a considerable amount of memory" could be 8GB or 512, depending on whom you ask :-)

davidweichiang commented 5 years ago

It looks great! On Safari, when you click on a pdf/bib link and then click the browser's back button, the little callout ("Open PDF" or "Export BibTeX") stays visible.

danielgildea commented 5 years ago

awesome!!!!!!!!!

texttheater commented 5 years ago

I think it would look better if the header had the same width as the content. I.e., the ACL logo would move to the left and the search box to the right, in order to align with the content.

desilinguist commented 5 years ago

Looks awesome! Great work! 👏

mjpost commented 5 years ago

What's the reason for inserting newlines in the bib field values? (for example, in booktitle here, and titles elsewhere).

stevenbedrick commented 5 years ago

Disclaimer: This is about search, but is not about weird search behavior as such. Is Google Custom Search the long-term search solution for the new version of the Anthology? It is inherently waaaaay less functional than the existing search system on the current Anthology- for example, the current search page has really great result faceting, etc.

stevenbedrick commented 5 years ago

And I just saw #165 - glad to see that something more flexible is on the roadmap/radar. In the meantime, we could also link to the DFKI "ACL Anthology Searchbench".

aryamccarthy commented 5 years ago

On mobile, the magnifying glass of the search bar gets forced to the next row for me.

aryamccarthy commented 5 years ago

Is the BibTeX generation handling special characters properly?

This entry has weird quotation marks in the abstract. http://aclweb.org/anthology/papers/C/C18/C18-1137.bib This one has weird things going on in the title field. http://aclweb.org/anthology/papers/K/K18/K18-3001.bib

danielhers commented 5 years ago

When there is just one paper in a conference, the noun after the number should be singular "paper" and not "papers". Example: Proceedings of the Pilot SENSEVAL 1 papers in http://www.aclweb.org/anthology/venues/semeval/
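A minimal sketch of a count-aware label, written in Python purely for illustration. The function name `paper_count_label` is hypothetical, and the site's actual templates may handle this quite differently:

```python
def paper_count_label(count: int) -> str:
    """Return a label such as '1 paper' or '12 papers' for a given count."""
    # Only a count of exactly one takes the singular noun.
    noun = "paper" if count == 1 else "papers"
    return f"{count} {noun}"


print(paper_count_label(1))
print(paper_count_label(12))
```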

rahular commented 5 years ago

Awesome work! One small issue I saw is that when I am browsing through papers in pages like this, there is no way for me to scroll back to the top instantly. The up button at the beginning of the page could instead float in a corner, so that it is always reachable.

danielgildea commented 5 years ago

Is the BibTeX generation handling special characters properly?

Fixed by [6bbc5a1f4f35744f609f384e866a95bf6cc8f021]

davidweichiang commented 5 years ago

Re: https://github.com/acl-org/acl-anthology/issues/170#issuecomment-471835229, when I view in Chrome or iOS Safari, I see mojibake, but on macOS Safari, it looks fine.

Although @danielgildea's fix puts the .bib file into ASCII (as it should be), I wonder whether, as a failsafe, the server could also put Content-Type: application/x-bibtex; charset=utf-8 into the response header.
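To illustrate why an explicit charset could help, here is a small Python sketch. It assumes the .bib file was served as UTF-8 bytes and that some clients fall back to a legacy single-byte encoding (Windows-1252 is used here as an example) when no charset is declared. That fallback is an assumption about the affected browsers, not something confirmed in this thread:

```python
# The en dash (U+2013) as it would be stored in a UTF-8 encoded .bib file.
raw_bytes = "CoNLL\u2013SIGMORPHON".encode("utf-8")

# A client that is told (or correctly guesses) UTF-8 recovers the original text.
as_utf8 = raw_bytes.decode("utf-8")

# A client that assumes Windows-1252 turns the three-byte dash into
# three unrelated characters, which is the classic mojibake pattern.
as_cp1252 = raw_bytes.decode("cp1252")

# Header value proposed above; declaring the charset removes the guesswork.
content_type = "application/x-bibtex; charset=utf-8"

print(repr(as_utf8))
print(repr(as_cp1252))
print(f"Content-Type: {content_type}")
```

A pure-ASCII file sidesteps the problem entirely, since ASCII bytes decode identically under both encodings; the header only matters if non-ASCII characters ever end up in the output again.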

danielgildea commented 5 years ago

What's the reason for inserting newlines in the bib field values? (for example, in booktitle here, and titles elsewhere).

anth2bib.py is just passing through newlines that are in the titles in the xml files.
I can't figure out where they come from originally. Personally, I think they make the bibtex more readable anyway.

anth2bib.py does insert newlines between author names. I think this makes it more readable, especially when names are in "Last, First" format.
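For illustration only, a one-name-per-line author field could be produced along these lines. This is a sketch and not the code in anth2bib.py; the function name `format_author_field`, the indentation width, and the example names are all made up:

```python
def format_author_field(authors: list[str], indent: int = 4) -> str:
    """Render a BibTeX author field with one name per line.

    Continuation lines are aligned using spaces only (never tabs).
    """
    prefix = " " * indent + 'author = "'
    # Continuation lines start directly under the first name.
    continuation = " " * len(prefix)
    joined = (" and\n" + continuation).join(authors)
    return f'{prefix}{joined}",'


# Placeholder names in "Last, First" format.
print(format_author_field(["Doe, Jane", "Roe, Richard"]))
```

Aligning the continuation lines under the first name keeps the "and" separators at the line ends, so each line reads as exactly one author.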

aryamccarthy commented 5 years ago

Is the BibTeX generation handling special characters properly?

Fixed by [6bbc5a1]

In macOS Safari, I'm still seeing mojibake where the en dash should be in "CoNLL–SIGMORPHON". Does the build script need to be re-run to show the fix?

mbollmann commented 5 years ago

Does the build script need to be re-run to show the fix?

Absolutely. Fixes are not reflected on the live website until @mjpost rebuilds it and pushes it there.

mjpost commented 5 years ago

I agree the one-line-per-author variant is more readable and is fine with me, as long as we make sure to use spaces and not tabs (per #16).

I'll rebuild soon, by tonight at the latest. Once we have continuous integration checks built (#102) and other checks against commits to the master branch, we can have it automated.

mbollmann commented 5 years ago

Thanks for all the feedback so far! I've implemented a bunch of minor layout fixes based on the comments here (with the same caveat as above: will not be live until Matt rebuilds).

Disclaimer: This is about search, but is not about weird search behavior as such. Is Google Custom Search the long-term search solution for the new version of the Anthology? It is inherently waaaaay less functional than the existing search system on the current Anthology- for example, the current search page has really great result faceting, etc.

I believe Google Custom Search is much more powerful than people give it credit for, and it offers customization options that should allow for similar result faceting and features as before. However, that requires some more work on my part, and it wasn't really possible to implement and test this earlier as, by its very nature, it requires the new site to be live and getting indexed by Google first.

I'd really like to advocate for some more patience here over the coming weeks as I'm hoping to improve this. Maintaining a custom-made search solution is a huge liability IMO, and I would really like for people to give the Google version a fair chance first.

stevenbedrick commented 5 years ago

@mbollmann That's totally fair, and thank you for the reply. I certainly see the value of using an off-the-shelf/hosted search platform in general, and also of using Google Custom Search in particular as a "getting things up and running" solution. For the sake of clarity, my concerns are less about the search behavior of GCS- if anybody can build a decent text search engine, it'd be Google! My concerns are more about search UI/UX- result faceting, etc. I'm happy to give GCS more of a chance, and am looking forward to seeing what we're able to do with GCS in terms of customization. Thank you (all of you!) for your efforts on this project; I do very much like the redesign overall and am excited to see it evolve!

mjpost commented 5 years ago

Okay, rebuilt. I also merged in master which had some corrections.

aryamccarthy commented 5 years ago

Unclear whether this is a parsing error or a data error: this BibTeX has no article title.

mjpost commented 5 years ago

Thanks! The title appears in the HTML: (http://aclweb.org/anthology/D13-1088/) and is in the XML, so I'm not sure what's going on here.

mbollmann commented 5 years ago

Thanks! The title appears in the HTML: (http://aclweb.org/anthology/D13-1088/) and is in the XML, so I'm not sure what's going on here.

Pretty sure it's related somehow to the title starting with <fixed-case>. It's fixed with the refactored BibTeX generation in 7cd20c3.
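As an illustration of how a title that starts with a child element can go missing, here is a Python sketch using the standard library's ElementTree. This is a guess at a plausible failure mode, not the confirmed cause and not the code changed in 7cd20c3; the sample title and the helper `to_bibtex_title` are hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical title element whose first node is a <fixed-case> child.
title = ET.fromstring("<title><fixed-case>B</fixed-case>ayesian models</title>")

# .text only covers text *before* the first child, so it is None here.
naive_title = title.text

# itertext() walks the element and its descendants, including tail text.
full_title = "".join(title.itertext())


def to_bibtex_title(element: ET.Element) -> str:
    """Flatten a title, wrapping <fixed-case> spans in braces to protect case."""
    parts = [element.text or ""]
    for child in element:
        inner = "".join(child.itertext())
        parts.append("{" + inner + "}" if child.tag == "fixed-case" else inner)
        # Text following the child belongs to the child's tail, not the parent.
        parts.append(child.tail or "")
    return "".join(parts)


print(naive_title)
print(full_title)
print(to_bibtex_title(title))
```

Reading only `.text` would yield an empty title for any entry that opens with markup, which matches the symptom, though other causes are possible.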

mjpost commented 5 years ago

Ah, I was looking at the master branch. I'll rebuild tonight.

mjpost commented 5 years ago

Done, and the problem is indeed fixed. Thanks!

aryamccarthy commented 5 years ago

Another question of is-it-the-data-or-the-site:

SIGMORPHON has workshops listed through 2014, but one of the 2014 ones is really 2016: W16-20. On top of that, their 2018 workshop isn't listed on their page.

mbollmann commented 5 years ago

Another question of is-it-the-data-or-the-site:

SIGMORPHON has workshops listed through 2014, but one of the 2014 ones is really 2016: W16-20. On top of that, their 2018 workshop isn't listed on their page.

It's the data: https://github.com/acl-org/acl-anthology/blob/static-rewrite/import/sigmorphon.yaml

The W16-20 one is tagged as 2014 there, and there's no entry for 2018. You can submit a PR or I can fix it sometime later.

mjpost commented 5 years ago

@aryamccarthy, could you submit a PR (against the static-rewrite branch)?

nschneid commented 5 years ago

Clarification question: Which will be the permanent home of the anthology, aclweb.org/anthology or aclanthology.info? Re: https://github.com/zotero/translators/issues/1702#issuecomment-475880213

mbollmann commented 5 years ago

Clarification question: Which will be the permanent home of the anthology, aclweb.org/anthology or aclanthology.info? Re: zotero/translators#1702 (comment)

The former (aclweb.org/anthology). There's already an issue (#178) to add 301 redirects from the aclanthology.info site, they're just not up yet.

Evpok commented 5 years ago

Quick question (I'll open a dedicated feature request if people are interested): is there any hope of supporting BibLaTeX exports? The data model of the default styles includes keys like eventdate, eventtitle and eventtitleaddon that could be used for the conferences' date, full name and short name respectively, and booksubtitle, which would allow removing the volume title from the proceedings title, and so on. That would make for cleaner metadata.
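To make the idea concrete, here is a rough Python sketch of what an exported entry might look like with those fields. The helper `render_biblatex_entry` is hypothetical, and every value in the example is a made-up placeholder rather than real Anthology metadata:

```python
def render_biblatex_entry(key: str, fields: dict[str, str]) -> str:
    """Render an @inproceedings entry, skipping fields with empty values."""
    lines = [f"@inproceedings{{{key},"]
    for name, value in fields.items():
        if value:
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# All values below are placeholders, not real Anthology metadata.
entry = render_biblatex_entry(
    "example-key",
    {
        "title": "An Example Paper",
        "author": "Doe, Jane and Roe, Richard",
        "booktitle": "Proceedings of the Example Conference",
        "booksubtitle": "Volume 1: Long Papers",
        "eventtitle": "Example Conference on Language Processing",
        "eventtitleaddon": "EXCONF",
        "eventdate": "2019-01-01/2019-01-03",
    },
)
print(entry)
```

Keeping the volume name in booksubtitle and the conference names in the event fields means booktitle no longer has to carry all three.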

mjpost commented 5 years ago

Would you need individual files for every paper, or would it be sufficient to dump a single BibLaTeX file for the entire anthology? We currently do both for BibTeX (you can get the whole archive here [5 MB]).

Evpok commented 5 years ago

Ideally for every paper, for feature parity with the BibTeX version, but I understand it could be impractical, so a single one for the whole anthology would be a good start :-) (at least for the next 10 years until me and other proselytes manage to kill BibTeX for good). I can volunteer to do the work by the way, whatever solution you choose.

GabrielLin commented 5 years ago

I urge you to bring back the EndNote importer file. Thanks.

mjpost commented 5 years ago

Hi @GabrielLin,

Thanks for the feedback. We have created a new issue to track this (#235), and have a question for you there.

seeledu commented 5 years ago

I'm a Chinese user, and I can't use the search results.

seeledu commented 5 years ago

I'm a Chinese user, and I can't use the search results.

But when I use a VPN, it works well.

akoehn commented 5 years ago

@seeledu, does google.com work without VPN? If not, that's the problem :-/

knmnyn commented 5 years ago

Cross-ref #244. Google products are blocked by the Great Firewall of China occasionally (more often than not). This is why relying on Google products for international services that have a large membership in China (i.e., ACL) is not usually a good idea.

As ACL is an international organization, we may want to think more about this issue (and maybe get the official word from the ACL Exec). Previously the Rails search allowed a pretty comprehensive search within the system, but now perhaps searching the static site would be easier in some ways.

Franck-Dernoncourt commented 5 years ago

On the author page, e.g. https://www.aclweb.org/anthology/people/y/yan-song/, we can only see the count for the top 5 most frequent venues (it used to be possible to view the count for all venues):

image

yucc2018 commented 5 years ago

The papers (full PDFs) on the following page can't be downloaded: https://www.aclweb.org/anthology/events/cl-2018/

malikalamgirian commented 5 years ago

Where can I find the appendices? I could not find the appendices to the papers anywhere for EMNLP 2018.

mbollmann commented 5 years ago

Where can I find the appendices? I could not find the appendices to the papers anywhere for EMNLP 2018.

Supplementary material can be accessed via the green buttons. For EMNLP 2018, they all seem to be labeled "Attachment". If that doesn't answer your question, can you clarify?

malikalamgirian commented 5 years ago

I still could not find the attachments for this paper https://aclweb.org/anthology/papers/D/D18/D18-1514/

It mentions on page 4805 that Appendix A.2 contains some related materials, but I still could not find the Appendix or the attachments anywhere.

mbollmann commented 5 years ago

I still could not find the attachments for this paper https://aclweb.org/anthology/papers/D/D18/D18-1514/

There don't seem to be any additional files for this submission on the server, so my first guess would be that despite what they write in the paper, the authors didn't actually provide an appendix. Maybe someone involved in ingesting EMNLP 2018 can dig into this more, but my assumption is that you'd have to contact the authors.

mjpost commented 5 years ago

Yes, it seems these authors forgot to submit it. If you contact them, you could let them know they could send it to me for uploading by creating an Issue here.

You can see what attachments look like on other papers, e.g., https://aclweb.org/anthology/papers/D/D18/D18-1512/

yucc2018 commented 5 years ago

Can anyone supply the NAACL 2019 papers?

mjpost commented 5 years ago

NAACL 2019 papers will be available in the Anthology on June 2, 2019.

aryamccarthy commented 5 years ago

The main page of the anthology is slightly wider than all other pages. Moving to one of those other pages means the top bar contents jump closer together.