lbechberger / ML4NLP

Material for the Practical Seminar "Machine Learning for Natural Language Processing"
MIT License

Feedback for Group Zeta #9

Open lbechberger opened 5 years ago

lbechberger commented 5 years ago

This is the thread where all the other groups leave their feedback for the documentation of group Zeta.

pphilihpp commented 5 years ago

Hello Group Zeta,

this is our feedback on your documentation of the last week. Overall, you explained your project goal clearly and stuck to the most relevant information without missing the important key points. Others can mostly understand what you are planning to do in this project.

In our opinion, you used all terms correctly and did not provide wrong information. However, some of your statements (e.g. "... assigns probabilities regarding their relevance ...") are a little vague; in the future, you could explain such statements in more detail.

You used your terms consistently, so the reader can easily follow your description. We see further room for improvement in linking more information into your project, e.g. relating your ideas to other papers, projects, or literature, and perhaps providing external links to the tools you are using, like SPARQL. That helps readers inform themselves about the tools more easily and understand the overall project better.

In our opinion, some of your design decisions are described very well (e.g. which questions you limit your system to and why you do not include questions like “How did Donald Trump become 45th President of the United States?“), but other parts are somewhat missing (e.g. why your approach requires three different classifiers, or how you choose your classifiers). Maybe you could explain those parts in more detail in the future.

Finally, your style is very good, and you should keep this simple-to-read but at the same time precise writing style. Due to the division into sections with their own topics, your text is well structured, and the example questions you have provided clearly help to understand your approach better.

Try to include our ideas for improvement in the future and keep up the good work!

apukropski commented 5 years ago

Review Week 3

Your dataset description gives a good overview of how you want to create your triple dataset. It was easy to read because you divided your text into essential paragraphs and your titles matched the content. We really liked the idea of listing the properties of your auto-generated dataset at the end as a summary. After reading your explanation of the data generation, we only have one small uncertainty left: will the end user be able to ask questions about only one given article (i.e. after reading it), or will he/she be able to query over all the articles in a given category, or even over the whole database? It would be great if you could clarify this a bit more.

All in all, your well-structured text offers enough examples to understand how you plan to acquire your dataset, we did not find any contradictions, and you provided references to the sources you used.

ljscfo commented 5 years ago

First of all, the latest part of your documentation makes a really good impression. It is nicely conclusive and you don't shy away from going into some technical details. Nevertheless, some minor things might be worth improving:

About the structure: You could improve it with subheadings, or at least an additional heading for "Week 4". Maybe name it "Data set creation workflow" or, more specifically, "Triple generation from articles".

Two things on examples: It's great that you give an example of the questions you expect to be asked about a given triple. Nevertheless, it appears a bit out of the blue, because the introductory sentence is a few sentences above in the paragraph. Maybe write something like "Here is an example of a triple and corresponding questions:". Additionally, you could extend the example to show what happens during steps 2-4, because that seems to be the core idea of the dataset creation.

One thing about consistency: I'd suggest making the bullet points of your enumeration all full sentences. (Or staying with non-sentences, but that would be pretty hard for the last bullet point.)

Another thing about accessibility: It is nice that you introduce abbreviations like NER and RDF by writing them out and adding the abbreviation in parentheses. You should do the same for the abbreviations POS and nltk (just for formal consistency; of course, everyone understands nltk because of the website link you provided).

One little thing about grammatical accuracy: Add a full stop between "...applied" and "The product..." in the fourth bullet point.

mpoemsl commented 5 years ago

Feedback for Week 5:

First of all, the structure of your documentation was nicely done; proper titles and numbering make it easy to keep track of what you are talking about. Each topic was self-contained, so one could understand it without backtracking through the rest of the documentation. In all of your complicated cases (the regex, the triplets, and your XML code) you used examples that made them easy to understand. Especially the XML code profits from having a properly structured example.

Also, you went beyond the scope of the project, but you argued your interest in it and also showed the simplicity of the alternative. However, considering this is the part that other groups probably have the least background knowledge of, it is very dense in information. Perhaps a bit more explanation, or splitting it into sub-paragraphs, would make it easier to understand. Also, the first time we read your documentation, we didn't quite get what you wanted to convey when you talked about the regex that excluded gerunds. Only after we read the paragraph about false positives with "in" did we understand your point. Perhaps you could put that paragraph first, and then talk about excluding one of the regular false positives with this regex?

Having a short look at your code, it looks quite good. Most variable and method names are well chosen and speak for themselves, with the only exception being recognize_ne. And most of your code gives regular updates on what it is currently doing/has just done (i.e. print("Article number " + str(n) + "tokeinzed into sentences successfully!")). As a small point of personal preference, we'd avoid putting more than one or two lines of code in the main function; just put that into a function that you call from main.
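To illustrate the last point, here is a minimal sketch of what we mean by keeping main() thin; the function name tokenize_articles and the sample texts are made up for illustration, not taken from your actual code:

```python
def tokenize_articles(articles):
    """Process each article with a progress message; returns the number processed."""
    for n, article in enumerate(articles):
        # ... tokenize into sentences, run NER, etc. ...
        print("Article number " + str(n) + " tokenized into sentences successfully!")
    return len(articles)

def main():
    # main() only wires things together; the actual work lives in functions.
    articles = ["First article text.", "Second article text."]
    tokenize_articles(articles)

if __name__ == "__main__":
    main()
```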

cstenkamp commented 5 years ago

Week 6:

Concerning the technical accuracy of your sample XML from 6.2: either you forgot to change the subject_question and the two object_questions, or your dataset is very redundant and also very weird; why save the very same questions a second and third time, even when the answers are now different? I am not sure if I understood it wrongly, but in any case I'd delete these parts from the negative QA set, especially since they can easily be auto-generated from the correct triple.

By the way, concerning your design decisions: why are these questions in the XML in the first place? I don't know if that is the general case, but at least these questions can easily be generated automatically. Wouldn't it be less error-prone and more storage-efficient to generate them on demand instead of storing them in the XML? Finally, if these questions are necessary and purposely equal for the negative and positive questions, I'd suggest linking the negative ones to their correct counterparts to save storage.

In general, concerning the structure of your text: if you mention something you plan on implementing, it helps to say immediately where you plan on getting the data from, or to link to the respective section. For example, in section 6.2.1 you talk about using additional information, and in the following section you mention where basically all the additional information you're using comes from. While this is not too bad, since these sections are right behind each other, it would help to mention where the data comes from already in the earlier section.
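To make the on-demand idea concrete, here is a hedged sketch: the function name and question templates below are our own assumptions, not your actual code; the point is only that the XML would then need to store nothing but the triple itself:

```python
def questions_from_triple(subject, predicate, obj):
    """Derive the subject/object questions from a (subject, predicate, object) triple."""
    return {
        "subject_question": "Who or what {} {}?".format(predicate, obj),
        "object_question": "Whom or what did {} {}?".format(subject, predicate),
    }

qs = questions_from_triple("Barack Obama", "visited", "Berlin")
# qs["subject_question"] -> "Who or what visited Berlin?"
# qs["object_question"]  -> "Whom or what did Barack Obama visited?"
```

(The naive templates obviously produce clumsy grammar for some predicates; real templates would need inflection handling, but the storage argument stands either way.)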

Concerning the problem of noise in your dataset: we currently have the same problem, even though we created our dataset in a completely different way than you did, namely by using the underlying information accessed via SPARQL. Maybe it would be a good idea for you to use that too, as a sanity check: if Barack_Obama is referred to as a single entity (of type person) in SPARQL, just like NLTK's NER said, you can be relatively certain that both are right, so you could consider only the cases where SPARQL and the NLTK NER labelled equally. In fact, we're planning on doing a similar version of your approach in order to reduce noise in our own dataset.
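A minimal sketch of that agreement filter, assuming both sources have already been reduced to plain entity-to-type dicts (the dicts below are toy placeholders, not real NER or SPARQL output):

```python
def agreeing_entities(ner_labels, sparql_labels):
    """Keep only entities whose type both sources agree on."""
    return {
        entity: entity_type
        for entity, entity_type in ner_labels.items()
        if sparql_labels.get(entity) == entity_type
    }

ner_labels = {"Barack_Obama": "PERSON", "Berlin": "LOCATION", "Python": "ORGANIZATION"}
sparql_labels = {"Barack_Obama": "PERSON", "Berlin": "CITY"}
filtered = agreeing_entities(ner_labels, sparql_labels)
# filtered -> {"Barack_Obama": "PERSON"}
```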

JuAutz commented 5 years ago

As usual, you properly structured your documentation. Somewhat unusually, you also structured your git repository by using folders to avoid cluttering the root with all the files, making it easier to get an overview of your code. Furthermore, you marked deprecated files as such, increasing the order of your repository. The methods in your code use self-explanatory names; keep doing that! Content-wise, your explanation of why your approach failed, and of your merge with group Delta, was very informative. However, when discussing the split, it would have been nice if you had argued more for why you chose 70:20:10 or 10-fold, or even what your intuitions were. Also, while it is understandable that you do not have the complete dataset yet, some preliminary results would have been nice.
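For reference, a 70:20:10 split is only a couple of lines; this is a generic sketch (the function name and seed are our own choices, not your code):

```python
import random

def split_70_20_10(items, seed=42):
    """Shuffle and split items into 70% train, 20% validation, 10% test."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = int(len(items) * 0.7)
    n_val = int(len(items) * 0.2)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_70_20_10(range(100))
# len(train), len(val), len(test) -> 70, 20, 10
```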

AnnaBruns commented 5 years ago

Dear Group Zeta,

Overall, we enjoyed reading your documentation because it was very informative, well structured and written in a comprehensible manner. We noticed no spelling or grammatical mistakes in your text. If you would like to improve a bit more, you could use visualizations.

Documentation for Week 8, Feature Extraction: In this part you talked about the methods you would like to use to extract features from your dataset. You introduced them in a technically accurate and understandable way and explained why they could be useful for your purposes. However, you did not yet explain how exactly you would like to extract the features, and a concrete link to the next part seems to be missing.

Chapter 9, Feature Extraction: In this part you explained plausibly that you differentiate between source-based and self-contained features and what you mean by that. You structured this part in a meaningful way and talked very concretely about the features you extract from your dataset, how you are doing it, which values they can take, and why they are useful for your task. We felt that the connection between part 8 and part 9 went missing along the way, which you should maybe reconsider before handing in the final version. Also, you made no reference to your code, which made it harder to find the implementations of what you described.

Code: Your code in feature_generation.py and negatives_generation.py is well documented and you used meaningful variable and function names.

ljscfo commented 5 years ago

First of all, the list of the features you are using is great, especially with the examples you give for each feature. Additionally, it's good to know that you have already given some thought to things you haven't completed yet (like the hyperparameter settings). Here are a couple of things that could be worth improving: