maximilianh / cellBrowser

main repo: https://github.com/ucscGenomeBrowser/cellBrowser/ - Python pipeline and Javascript scatter plot library for single-cell datasets, http://cellbrowser.rtfd.org
GNU General Public License v3.0

cbBuild error, can't recognize Louvain_Cluster vs Louvain Cluster #79

Closed matthewspeir closed 5 years ago

matthewspeir commented 5 years ago

When you run 'cbTool metaCat' it replaces any spaces in the original field names with underscores, '_'. For example, 'Louvain Cluster' would become 'Louvain_Cluster' in the output file.

Now, if you use this new metadata file as input to cbBuild, you get an error like:

ERROR:root:Config statement 'clusterField' contains an invalid field name, 'Louvain Cluster'. Valid meta field names are: donorkey, genes_detected, cellId, UMI_Count, Louvain_Cluster, Percent_Mitochond_, ethnicity_ontology, ethnicity_label, Expressed_Genes, bundle_uuid

cbBuild command:

cbBuild -o ~/public_html/cb_pancreas_v5 -p 8896

Input dir: /hive/users/mspeir/cellbrowserTest/pancreas/new_metadata_test/matrix_files/cbScanpyOut_pancreas_aging

In this example, if you change 'Louvain_Cluster' to 'Louvain Cluster' in meta.meta_cat.tsv and re-run cbBuild, it completes without any errors.

Based on the error, it looks like it should recognize 'Louvain_Cluster' as it's listed under the 'valid meta field names'.
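
To make the mismatch concrete, here is a minimal Python sketch of the kind of check that produces this error, plus a tolerant lookup that treats spaces and underscores as equivalent. This is not cbBuild's actual code: `read_meta_fields` and `resolve_field` are hypothetical names, and the sample table is a made-up subset of the columns listed in the error message.

```python
import csv
import io


def read_meta_fields(tsv_text):
    """Return the column names from the first line of a tab-separated table."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return next(reader)


def resolve_field(conf_name, meta_fields):
    """Look up conf_name among meta_fields (hypothetical helper).

    An exact match wins. Otherwise spaces and underscores are treated as
    equivalent, and the spelling used in the metadata header is returned.
    """
    if conf_name in meta_fields:
        return conf_name
    wanted = conf_name.replace(" ", "_")
    for field in meta_fields:
        if field.replace(" ", "_") == wanted:
            return field
    raise ValueError(
        "invalid field name %r; valid meta field names are: %s"
        % (conf_name, ", ".join(meta_fields))
    )


# Made-up metadata table whose header uses the underscore spelling.
meta_tsv = "cellId\tUMI_Count\tLouvain_Cluster\nAAAC\t1200\t3\n"
fields = read_meta_fields(meta_tsv)

# The config still uses the spelling with a space.
print(resolve_field("Louvain Cluster", fields))  # prints Louvain_Cluster
```

A strict check that only accepts exact matches fails here, because the config value and the header differ by one character.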

maximilianh commented 5 years ago

Oh darn. Yes. Spaces create problems in R. And the underscore will be replaced with space in the UI but I hadn’t thought that the default config does have an underscore. Ok I’ll fix this, it’s not hard. Thanks!

matthewspeir commented 5 years ago

Ahh, I didn't even notice that there was a 'clusterField' setting in the cellbrowser.conf. Maybe this is user error then?

maximilianh commented 5 years ago

Well, the program shouldn’t replace spaces, at least not here. Yes, you can fix it by changing the conf but it’d be easier not having to change it.
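
For reference, the conf workaround mentioned here would look roughly like the line below. It assumes cellbrowser.conf uses Python-style assignments and that the metadata header now reads Louvain_Cluster; all other settings in the file are left out.

```python
# cellbrowser.conf (excerpt): only clusterField is taken from this thread.
# The value has to match a column name in the metadata file exactly.
clusterField = "Louvain_Cluster"
```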

maximilianh commented 5 years ago

OK, first: metaCat now has a new option, --first, which makes it easier to change the sort order, so you don't need awk anymore.

maximilianh commented 5 years ago

OK, this seems to be fixed now, metaCat does not change field names anymore. Took me a bit to change this, as Python does this by default with fields. There may be other places where spaces in field names are secretly changed to underscores, but it's easy to change these places now.
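
The comment does not say which Python feature is meant. One plausible reading, an assumption rather than something confirmed in this thread, is `collections.namedtuple`, which only accepts field names that are valid identifiers. Code that builds row types from a table header therefore has to sanitize names such as 'Louvain Cluster'. The sketch below shows that constraint and one way to keep the original header for output; `safe_name` and `original_by_safe` are hypothetical names, not cellBrowser internals.

```python
import re
from collections import namedtuple

headers = ["cellId", "Louvain Cluster", "UMI_Count"]

# namedtuple refuses field names that are not valid Python identifiers.
try:
    namedtuple("MetaRow", headers)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # prints True


def safe_name(header):
    """Replace every character other than letters, digits and '_' with '_'.

    Hypothetical helper; it does not handle leading digits or keywords.
    """
    return re.sub(r"[^0-9a-zA-Z_]", "_", header)


safe_headers = [safe_name(h) for h in headers]
MetaRow = namedtuple("MetaRow", safe_headers)

# Remember the original spelling so that output files keep it.
original_by_safe = dict(zip(safe_headers, headers))

row = MetaRow("AAAC", "3", "1200")
output_header = [original_by_safe[name] for name in row._fields]
print(output_header)  # prints ['cellId', 'Louvain Cluster', 'UMI_Count']
```

Writing `row._fields` straight to the output file would produce the underscore spelling, which is the behavior reported in this issue.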

maximilianh commented 5 years ago

The fix is in release 0.4.50.

matthewspeir commented 5 years ago

Cool, I think this is now fixed.