Open ireneisdoomed opened 1 month ago
Describe the bug There are 1156 terms in EFO that have a mapping to multiple MeSH terms. The majority of them (1123) are duplicates.
Observed behaviour See ticket opened to EFO for context https://github.com/EBISPOT/efo/issues/2308
Expected behaviour A clear and concise description of what you expected to happen.
To Reproduce Code provided by @Juanmaria-rr
    from pyspark.sql import functions as F

    faulty = (
        diseases.withColumn(
            "mesh1_flag",
            # First xref written with the "MeSH" casing
            F.expr("filter(dbXRefs, x -> x rlike 'MeSH')")[0],
        )
        .withColumn(
            "MESH2_flag",
            # First xref written with the "MESH" casing
            F.expr("filter(dbXRefs, x -> x rlike 'MESH')")[0],
        )
        # Strip the prefix (case-insensitively) so only the identifier is compared
        .withColumn("cleaned_mesh1", F.regexp_replace(F.col("mesh1_flag"), "(?i)MeSH:", ""))
        .withColumn("cleaned_mesh2", F.regexp_replace(F.col("MESH2_flag"), "(?i)MeSH:", ""))
        .withColumn(
            "equalOrNot",
            F.when(
                (F.col("cleaned_mesh1").isNotNull()) & (F.col("cleaned_mesh2").isNotNull()),
                F.when(
                    F.col("cleaned_mesh1") == F.col("cleaned_mesh2"), F.lit("equal")
                ).otherwise(F.lit("different")),
            )
            .when(
                (F.col("cleaned_mesh1").isNotNull()) & (F.col("cleaned_mesh2").isNull()),
                F.lit("mesh1"),
            )
            .when(
                (F.col("cleaned_mesh1").isNull()) & (F.col("cleaned_mesh2").isNotNull()),
                F.lit("mesh2"),
            )
            .when(
                (F.col("cleaned_mesh1").isNull()) & (F.col("cleaned_mesh2").isNull()),
                F.lit("noData"),
            ),
        )
        # Keep only terms that carry both spellings; filter before the columns are renamed
        .filter((F.col("cleaned_mesh1").isNotNull()) & (F.col("cleaned_mesh2").isNotNull()))
        .selectExpr("id", "cleaned_mesh1 as MeSH", "cleaned_mesh2 as MESH", "equalOrNot")
    )
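To make "duplicate" concrete without a Spark session, here is a minimal plain-Python sketch. It assumes the duplicated cross-references differ only in the casing of the prefix (`MeSH:` vs `MESH:`), as the snippet above suggests. The helper name `distinct_mesh_ids` and the identifiers are hypothetical and for illustration only; they do not come from EFO or the pipeline.

```python
import re

# Assumption: MeSH xrefs start with a "mesh:" prefix in some casing ("MeSH:", "MESH:").
_MESH_PREFIX = re.compile(r"^mesh:", re.IGNORECASE)


def distinct_mesh_ids(db_xrefs):
    """Return the distinct MeSH identifiers in a list of xrefs, ignoring prefix casing."""
    seen = []
    for xref in db_xrefs or []:
        if _MESH_PREFIX.match(xref):
            mesh_id = _MESH_PREFIX.sub("", xref)
            if mesh_id not in seen:
                seen.append(mesh_id)
    return seen


# Made-up identifiers, not real EFO data
duplicated = ["MeSH:D000001", "MESH:D000001", "OMIM:000000"]
different = ["MeSH:D000001", "MESH:D000002"]

print(distinct_mesh_ids(duplicated))
print(distinct_mesh_ids(different))
```

Under this assumption, a term whose list collapses to a single identifier corresponds to the "equal" case above, while a term that keeps two identifiers is a genuine mapping to multiple MeSH terms.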
Additional context I think this is not new, because the FE handles these cases by displaying them together.