Closed padieul closed 1 year ago
Hi all, we can arrange a meeting on Tuesday. How about 4:00 pm? If you'd prefer another time slot, please let me know. HeiConf/Webex is no problem; I will send you a link by email once we confirm the date.
Best, Dennis
@padieul @almasria @aabasova @vivkaz could you give me a reply by tomorrow noon? Otherwise I'll consider the meeting "cancelled" and we can re-schedule.
Hello Dennis, For some reason we did not receive a notification when you wrote your first reply; we only saw your last reply just now. Sorry for that. I will check with the team and update you.
Hello Dennis, Would it be OK for you to shift the meeting to Thursday at 16:00?
Yes, that also works. I'll send the meeting link tomorrow. See you Thursday!
Great, see you Thursday.
Hi all, I'm extremely sorry to realize that I've missed our meeting today. I've been sick today and somehow didn't look at my calendar in the morning.
Again, deeply sorry, I'll get back to you (hopefully tomorrow) about an alternative meeting time. Best, Dennis
Hi Dennis, No problem at all, we will wait for your email. Get well soon!
Regards
Here are some thoughts on the project state which I noticed:
TL;DR:
- Add `__pycache__` folders to the gitignore.
- The `.env` file already contains 2 passwords.
- Add `yt-spammer-purge` to the "existing code fragments" section, since this seems to be a rather important tool that already exists.
- Add a `README` file to subfolders: What is this folder about? What do the individual scripts do?

`yt-spammer-purge/`:
- This should be a `git submodule` and outsourced to a different repository. Of course, this is optional in our lecture projects, due to a smaller context. However, there are issues with the licensing of said software (GPL-3), which you need to adopt for the entirety of your project based on GPL's licensing terms, if you decide to keep your tool publicly available.
- `yt-spammer-purge` has a `client_secrets.json` file, which should probably not be on Github. The same goes for the `logs/` folder. Make sure to purge the git history of the `client_secrets.json` file, otherwise it will still be "there" in the git logs. (See, e.g., this answer for help: https://stackoverflow.com/questions/43762338/how-to-remove-file-from-git-history)

`models/`:
- `.gitignore` the `.vscode/` folder.
- The `models/data/client_secret*` file should probably not be on Github either.
- `data/` is explicitly stored in the repository. If this is for demonstration purposes only, I would make it an independent top-level folder. There also seem to exist multiple versions of the same file in different variants (e.g., `csv` vs `json` and so on). Can you remove duplicates and only have the "relevant" ones be stored on Github?
- `collaborative/recom_train.py` has almost zero comments and no `if __name__ == "__main__":` block. Please adjust this accordingly and give a basic description of what it trains.
- Are the `.ipynb` scripts actually in use? If so, consider converting them to actual `.py` scripts. IMO, Jupyter Notebooks are primarily for exploratory tasks (and should thus be moved outside of the "main" project scope).

`middleware/`:
- Instead of `Preprocessing pipeline for given comment`, you could write what the preprocessing step actually does: `Preprocessing pipeline for a given comment: Provides spacy tokenization and lemmatization, and removes stopwords.`
- Anything that is `print()`ed out should probably go through a logger; again, this is something for a real project, and can be left "as is" in your current project.
- […] `data_retriever.py` file.
- […] `data_retriever.py`. Also, there are a lot of TODOs in this file.

`frontend/`:
- The `console.log` statements should be clarified (why `!!!!!!!`?) and are probably not ideal as `console.log`.
- While using `iframes` is probably appropriate to show Kibana, I'm not sure whether they are still considered "bad practice" by some browsers. If it works for your browsers, all good, but otherwise make sure to check at least with Chrome/Firefox for compatibility.

Overall, I think the project is in pretty good shape (excited to see a demo from your end during the meeting hopefully!). I think the primary task should now be to "clean up" the project folders and clarify "what is where and why" by adding documentation/README files.
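To make the `recom_train.py` and `print()` points a bit more concrete, here is a rough sketch of what such a script skeleton could look like. This is not your actual code: the docstring text, the `train` and `main` functions, and the dummy loss are placeholders I made up purely for illustration, so adapt the names and the description to what the script really trains.

```python
"""Train the collaborative recommendation model (placeholder description).

Replace this docstring with a short summary of what the script trains,
which data it reads, and where the resulting model is written.
"""
import logging

# Module-level logger, used instead of bare print() calls.
logger = logging.getLogger(__name__)


def train(epochs: int) -> list:
    """Placeholder training loop: returns one dummy loss value per epoch."""
    losses = []
    for epoch in range(1, epochs + 1):
        loss = 1.0 / epoch  # stand-in for a real training step
        logger.info("epoch %d/%d, loss=%.3f", epoch, epochs, loss)
        losses.append(loss)
    return losses


def main() -> None:
    """Configure logging once, then run the (placeholder) training."""
    logging.basicConfig(level=logging.INFO)
    train(epochs=3)


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

With the guard in place, other modules (or tests) can import `train` without kicking off a training run, and the log level can be changed in one place instead of hunting down individual `print()` calls.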
Given that you actually did run some experiments (at least that's what I understand from the Jupyter Notebooks in the `models/` folder), they should definitely be added to the README. Generally, some of your choices should also be documented there.
Personally, I think the presentation style highly depends on what your project focused on. Generally, I would say there are two main focus areas that I see in 99% of the projects: Either you try to push the "scientific method" in a particular task, but have otherwise little to report in terms of "marketable product" (e.g., frontend/end-to-end project flow). In such cases, I would recommend a presentation (Powerpoint/LaTeX, etc.). Powerpoint nowadays also offers a voice-over recording functionality, which is sufficient for that.
Probably better suited to your project is simply a quick demo, which you can then mix with some basic flowchart, outlining the inner workings of your project. Here, it is primarily important to focus on what problem you are trying to address. I think this should be fairly easy for your project, which has a clear "product focus" and immediate value for listeners.
For recording, I generally recommend OBS, which is nowadays fairly easy to set up and has tons of options for customization as well. I would recommend having only one speaker in the presentation, since this makes recording much easier; note that the grade will not depend on who is speaking in the presentation (I assume you all worked on it to some degree).
Really try to sell the points of what your project "solves" as a motivator first. With this, the other aspects ("technical creativity", "end result") are much less important, because it is still apparent why you are doing this, and individual components can always be improved (in some way or another). So, the basic flow should look like this:
Strong opening motivation (~1 min spent on the "why") -> Quick flowchart of architecture and explanation of what data was used (no more than 1-2 mins) -> Demo which showcases some "end-to-end" use case (remaining time, ideally ~2 mins)
Hi Dennis!
We would like to discuss the current state of our project. We mainly have the following questions:
Would it be possible to meet online with you via Heiconf?
Best Regards Team SpamScanner