Closed qiyunzhu closed 7 years ago
Great, this will be very helpful! Excited to see these wet lab tools moving.
On Jan 5, 2017, at 11:11 AM, Qiyun Zhu notifications@github.com<mailto:notifications@github.com> wrote:
Hello all @antgonzahttps://github.com/antgonza @josenavashttps://github.com/josenavas @mortonjthttps://github.com/mortonjt @sjanssen2https://github.com/sjanssen2
This PR contains a new function, which checks if the sample names contained in the plate map are identical to a given sample name list. Currently, Gail sends the sample name list to the wet lab, and later manually validates the sample names in the mapping file she gets back from the wet lab. This process is quite time consuming. Therefore I implemented this new function after talking with Gail and Greg.
Specifically, this new function will pick out the sample names that 1) are present in the mapping file but not in the name list, 2) are present in the name list but not in the mapping file, 3) have duplicates in the mapping file.
When any of these instances are detected, the program will display a warning message to the user (wet lab). Note that it is a warning instead of an error, because in many cases the user intensionally makes these exceptions (e.g., adding a blank sample, excluding a certain sample). The user will then examine the warning message to see if everything is expected.
For the whole upgrade plan please refer to my email on Jan 5.
Meanwhile, I addressed @mortonjthttps://github.com/mortonjt 's suggestion: moving test data from the script to the external file system. You can find the files in the data/ directory.
Thanks and happy new year!
You can view, comment on, or merge this pull request online at:
https://github.com/biocore/wetlab-assistant/pull/12
Commit Summary
File Changes
Patch Links:
- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/biocore/wetlab-assistant/pull/12, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB69KcXPIW-E14RwMlYw9yeBmj8iar_gks5rPUBNgaJpZM4LcBbd.
Do you think it is worthwhile to include the case where a user has multiple spaces in addition to tabs? It is common to accidentally embed extra spaces in excel
@mortonjt That's a good question! I guess some column headers and (mistyped) sample names may contain spaces, which will complicate the situation. Do you suggest we should treat continuous spaces as a tab, or we should trim off leading and trailing spaces of each cell?
@antgonza Thank you very much for the very careful comments! I will address them one by one. I am completely fine with coveralls. How can we add this?
@qiyunzhu re splitting whitespace
You can split arbituary whitespace with split
In [1]: s = "s sss \t t"
In [2]: s.split()
Out[2]: ['s', 'sss', 't']
I know that pandas can have this issue when parsing tables.
But I would need a little more context to understand how the wetlab is using this information. Are they working with excel, or just plain text?
@mortonjt The way they work is that they copy & paste the Google Sheet / Excel contents into a text file. This will guarantee that cells are separated by tabs. The script will work on the text files. Meanwhile, some cells may contain spaces (this is normal, I hope not but I guess it happens). Therefore simple s.split()
may not work properly.
Ok - in that case, it is probably good for now. If there are problems - we can patch them as appropriate.
@antgonza @mortonjt Now I figured out a better way: Use Python's warnings
module instead of plain print
. Warnings are easily testable. Meanwhile I put "validate" back to the main function, because now I can test warning instead of function return.
Looks good to me. @antgonza want to do one last look before merging?
just two minor comments
@mortonjt @wasade @antgonza anyone would like to merge? Thanks!
@antgonza @mortonjt @wasade Thank you very much!!
Hello all @antgonza @josenavas @mortonjt @sjanssen2
This PR contains a new function, which checks if the sample names contained in the plate map are identical to a given sample name list. Currently, Gail sends the sample name list to the wet lab, and later manually validates the sample names in the mapping file she gets back from the wet lab. This process is quite time consuming. Therefore I implemented this new function after talking with Gail and Greg.
Specifically, this new function will pick out the sample names that 1) are present in the mapping file but not in the name list, 2) are present in the name list but not in the mapping file, 3) have duplicates in the mapping file.
When any of these instances are detected, the program will display a warning message to the user (wet lab). Note that it is a warning instead of an error, because in many cases the user intensionally makes these exceptions (e.g., adding a blank sample, excluding a certain sample). The user will then examine the warning message to see if everything is expected.
For the whole upgrade plan please refer to my email on Jan 5.
Meanwhile, I addressed @mortonjt 's suggestion: moving test data from the script to the external file system. You can find the files in the data/ directory.
Thanks and happy new year!