caporaso-lab / mockrobiota

A public resource for microbiome bioinformatics benchmarking using artificially constructed (i.e., mock) communities.
http://mockrobiota.caporasolab.us
BSD 3-Clause "New" or "Revised" License

Permanent "issues" or notes pages for each mock community dataset #42

Closed nbokulich closed 8 years ago

nbokulich commented 8 years ago

We should create a permanent place for users to archive notes, tips, and known issues for each dataset. Some of these are not errors per se but things worth documenting for future users, e.g., whether mapping barcodes need to be reverse complemented for demultiplexing, or whether header lines in raw data cause problems with specific software. Other observations are useful to share even though they are not "issues" at all, e.g., whether specific samples in a dataset have low read counts post-QC and should be ignored.

@gregcaporaso what do you think?

I see 2 possibilities:

1) create an issue for each mock community and leave that issue open permanently. This will keep everything associated with a single MC organized in one place, and more comments can be added to that page as more notes/observations are made. The advantage is that notes are added as comments without the need for a PR, which streamlines the process. The disadvantage is that real issues will be intermixed with notes (even if we separate the "notes" page from real issues, things may get messy, as they already are!), and the issue page cannot be closed when real issues are solved. Simply filing separate issues could also get long and messy.

2) create a notes file in the main directory for each dataset. Users will need to submit a PR to add permanent notes, though this could also help keep things tidy. Another disadvantage is that users would need to go looking for this, and the issues page is where most users will already be searching to find known issues.

gregcaporaso commented 8 years ago

I think this is a good idea, but I strongly prefer the second option. The reason is that everything is then in the repository, so if it were ever to move (e.g., from GitHub), or is being accessed offline, all of the relevant information would be included. GitHub issues also aren't great for this kind of thing because, as they get long, valuable information can get lost in the discussion (whereas if the information is in a file, it's easy to make it more apparent). Also, note that it is possible to edit files and submit pull requests on GitHub in the web browser (via the file edit links), which could be used to simplify getting notes added.

nbokulich commented 8 years ago

Perhaps a README.md in the base directory for each dataset would be a good way to do this? These files could also contain the human-readable-description or other info and list known issues at the bottom.

gregcaporaso commented 8 years ago

That would work.


nbokulich commented 8 years ago

"Known issues / notes" are now included on the README.md page for each dataset.
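
For anyone who wants to confirm that this convention holds across a local checkout, here is a minimal sketch. It assumes each dataset lives in its own subdirectory of a single root folder and that the notes section can be found by searching the README text for "known issues". The function name `datasets_missing_notes`, the `NOTES_HEADING` constant, and the directory layout are all hypothetical and not part of the repository.

```python
from pathlib import Path

# Assumed marker text, taken from the wording used in this thread; the real
# READMEs may phrase their heading differently.
NOTES_HEADING = "known issues"


def datasets_missing_notes(data_root):
    """Return names of dataset directories that lack a README.md notes section.

    A directory is reported if it has no README.md at all, or if its
    README.md does not mention NOTES_HEADING (case-insensitive).
    """
    missing = []
    for dataset_dir in sorted(Path(data_root).iterdir()):
        # Skip loose files sitting next to the dataset directories.
        if not dataset_dir.is_dir():
            continue
        readme = dataset_dir / "README.md"
        if not readme.is_file():
            missing.append(dataset_dir.name)
            continue
        text = readme.read_text(encoding="utf-8").lower()
        if NOTES_HEADING not in text:
            missing.append(dataset_dir.name)
    return missing
```

Calling `datasets_missing_notes("data")` from the repository root (again assuming that is where the dataset directories live) returns a list of directory names that still need a notes section. A plain substring search is used rather than Markdown parsing, so it will also match the phrase outside a heading.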