scr0-0ge / Da3018-vt2023-project

Project: a genome assembly graph by Xingyi Chen
Feedback #13

Open arvestad opened 1 year ago

arvestad commented 1 year ago

Hi,

Here is my project feedback. Well done!

Feedback

Nice work with high ambitions. There is one flaw: you did not compute edge density per graph component but rather (if I read it correctly) per vertex neighborhood, that is, over the vertices within one edge of a given vertex. The outcome is still surprising to me: the neighborhood edge densities are quite low compared to the expected case of many components with density close to 1.
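To make the distinction concrete, here is a minimal Python sketch, not the project's code. It assumes an undirected simple graph and the usual density 2m / (n(n-1)) for a vertex set with n vertices and m internal edges. The function names and the tiny example graph are hypothetical, and a single-vertex set is given density 0.0 purely as a convention for this sketch.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def density(vertices, adj):
    """Edge density 2m / (n(n-1)) of the subgraph induced by `vertices`."""
    n = len(vertices)
    if n < 2:
        return 0.0  # convention chosen for this sketch
    # Summing internal degrees counts every internal edge twice.
    twice_m = sum(len(adj[v] & vertices) for v in vertices)
    return twice_m / (n * (n - 1))


def components(adj):
    """Yield each connected component as a set of vertices."""
    seen = set()
    for start in list(adj):
        if start in seen:
            continue
        seen.add(start)
        comp = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        yield comp


def neighborhood_density(v, adj):
    """Density over v together with all vertices one edge away from v."""
    return density({v} | adj[v], adj)


# Hypothetical example: a path a-b-c-d and a triangle x-y-z.
edges = [("a", "b"), ("b", "c"), ("c", "d"),
         ("x", "y"), ("y", "z"), ("x", "z")]
adj = build_adjacency(edges)

comp_densities = sorted(density(c, adj) for c in components(adj))
print(comp_densities)                           # [0.5, 1.0]
print(round(neighborhood_density("b", adj), 3))  # 0.667
print(neighborhood_density("a", adj))           # 1.0
```

In this toy graph the path component has density 0.5, while the neighborhoods of its vertices give 1.0 or about 0.667, so the two quantities are not interchangeable.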

Minor things:

Please use standard names for folders. "All code" ought to be named "src", "all csv file" ought to be "data", and "lab paperwork" ought to be "doc".

You write that the time complexity of DFS "is something to be mindful of", but you cannot really get a better method, so it is rather the implementation of it and the data handling that requires caution.
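As an illustration of where the caution lies, here is a minimal Python sketch (an assumption for illustration, not the project's implementation). The traversal is still linear in vertices plus edges; what changes is the implementation: an explicit stack instead of recursion, so long paths do not hit the interpreter's recursion limit, and compact data handling with integer vertex ids and a `bytearray` of visited flags. The names `count_components` and `path_graph` are hypothetical.

```python
def count_components(adj):
    """Count connected components with a DFS-style traversal.

    `adj` is a list of neighbor lists; vertices are 0 .. len(adj) - 1.
    Each vertex and each edge endpoint is handled once: O(V + E).
    """
    visited = bytearray(len(adj))  # one byte per vertex
    count = 0
    for start in range(len(adj)):
        if visited[start]:
            continue
        count += 1
        visited[start] = 1
        stack = [start]  # explicit stack instead of recursion
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if not visited[w]:
                    visited[w] = 1
                    stack.append(w)
    return count


def path_graph(n):
    """Build a path 0-1-2-...-(n-1) as adjacency lists."""
    adj = [[] for _ in range(n)]
    for i in range(n - 1):
        adj[i].append(i + 1)
        adj[i + 1].append(i)
    return adj


# A path of 200,000 vertices is far deeper than Python's default
# recursion limit, but poses no problem for the explicit stack.
big = path_graph(200_000)
big.append([])  # one extra isolated vertex
print(count_components(big))  # 2
```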

Multithreading can speed up computations, but I think its utility is limited when one has to work on a single large data structure. I guess each of your threads had its own copy of the graph, since your 64 GB of RAM filled up. That is certainly one way to go, but then you have to find ways to synchronize what the threads work on.
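One alternative to per-thread copies can be sketched as follows, in Python and purely as an assumption about how one might structure it: the workers share a single read-only graph, each gets a disjoint slice of the vertices, and the only coordination is combining the partial results at the end. The names `degree_sum` and `parallel_degree_sum` are hypothetical, and the per-vertex work (summing degrees) is a stand-in. In CPython the global interpreter lock limits the speed-up for CPU-bound work like this, so the sketch shows the sharing pattern rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def degree_sum(adj, vertices):
    """Stand-in for per-vertex work: sum the degrees of `vertices`."""
    return sum(len(adj[v]) for v in vertices)


def parallel_degree_sum(adj, n_workers=4):
    """Split vertices into disjoint slices; all workers read the same `adj`."""
    n = len(adj)
    # Worker i handles vertices i, i + n_workers, i + 2 * n_workers, ...
    chunks = [range(i, n, n_workers) for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(lambda chunk: degree_sum(adj, chunk), chunks)
        return sum(partial)  # the only point where results are combined


# Hypothetical example: path 0-1-2 plus an isolated vertex 3.
adj = [[1], [0, 2], [1], []]
total = parallel_degree_sum(adj, n_workers=2)
print(total)  # 4, i.e. twice the number of edges
```

Because the graph is never modified and the slices do not overlap, no locks are needed; memory use stays at one copy of the graph regardless of the number of workers.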

Good conclusion about the benefit of knowing more than one tool. I should admit that I was using Python myself when I first started working with this data, and the performance sufficed. I had 16 GB of RAM to use, which is probably necessary with Python, and more than I can expect students to have.

I am glad you enjoyed working on the project!

Review of grading criteria

Solving the computational problems.

Yes, although with a mistake. 1p

You have used a git repository at github.com in a skilled way.

Yes. 1p

Demonstrated application of algorithm techniques from the course.

Yes. 1p

Demonstrated ability to discuss and reflect on algorithm characteristics

Yes. 1p

A useful lab notebook.

You have written a lot and seem to have gathered useful information. It was however hard for me to follow, since the days were in a strange order, with, for example, "Day 0" and "Day 1" appearing twice. 0.5p

scr0-0ge commented 1 year ago

Hi!

Thanks for the feedback!

Sorry for the inconvenience with the lab notes. I totally forgot to separate them into two different files.

The reason Day 0 and Day 1 appear twice is that I combined my old lab notes with the new ones. The old ones are about my attempt to complete the entire project using only Python, which does not meet the requirement.

After I realized that, I kind of started everything from scratch and also wrote a new lab notebook. This is why there are two "Day 0, 1, 2".

I personally like minimalistic work interfaces and usually combine things so that fewer files show up in my folder. But it seems I shot myself in the foot by doing that: I totally forgot to separate them into different .md files before I submitted this project.

arvestad commented 1 year ago

Don’t worry about it. I would however argue that totally starting from scratch is part of the experience, so to speak, so I would recommend that you, in general, do not restart documentation.

Have a nice summer! L
